diff mbox series

[net] net: dsa: Do not leave DSA master with NULL netdev_ops

Message ID 20200504201806.27192-1-f.fainelli@gmail.com
State Accepted
Delegated to: David Miller
Series [net] net: dsa: Do not leave DSA master with NULL netdev_ops

Commit Message

Florian Fainelli May 4, 2020, 8:18 p.m. UTC
When ndo_get_phys_port_name() for the CPU port was added we introduced
an early check in dsa_master_ndo_setup() for when the DSA master network
device already implements ndo_get_phys_port_name(). In that case setup
returns before cpu_dp->orig_ndo_ops is assigned, yet the teardown
operation in dsa_master_ndo_teardown() did not check that
cpu_dp->orig_ndo_ops had actually been set to a non-NULL value.

With network device drivers such as virtio_net, this leads to a NULL
pointer dereference (NPD) as soon as the DSA switch hanging off of it
gets torn down, because we end up assigning a NULL pointer to the
virtio_net device's netdev_ops.

Fixes: da7b9e9b00d4 ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
Reported-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 net/dsa/master.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Vladimir Oltean May 4, 2020, 8:34 p.m. UTC | #1
Hi Florian,

On Mon, 4 May 2020 at 23:19, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
> When ndo_get_phys_port_name() for the CPU port was added we introduced
> an early check for when the DSA master network device in
> dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When
> we perform the teardown operation in dsa_master_ndo_teardown() we would
> not be checking that cpu_dp->orig_ndo_ops was successfully allocated and
> non-NULL initialized.
>
> With network device drivers such as virtio_net, this leads to a NPD as
> soon as the DSA switch hanging off of it gets torn down because we are
> now assigning the virtio_net device's netdev_ops a NULL pointer.
>
> Fixes: da7b9e9b00d4 ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
> Reported-by: Allen Pais <allen.pais@oracle.com>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---

The fix makes complete sense.
But on another note, if we don't overlay an ndo_get_phys_port_name if
the master already has one, doesn't that render the entire mechanism
of having a reliable way for user space to determine the CPU port
number pointless?

>  net/dsa/master.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/dsa/master.c b/net/dsa/master.c
> index b5c535af63a3..a621367c6e8c 100644
> --- a/net/dsa/master.c
> +++ b/net/dsa/master.c
> @@ -289,7 +289,8 @@ static void dsa_master_ndo_teardown(struct net_device *dev)
>  {
>         struct dsa_port *cpu_dp = dev->dsa_ptr;
>
> -       dev->netdev_ops = cpu_dp->orig_ndo_ops;
> +       if (cpu_dp->orig_ndo_ops)
> +               dev->netdev_ops = cpu_dp->orig_ndo_ops;
>         cpu_dp->orig_ndo_ops = NULL;
>  }
>
> --
> 2.20.1
>

Regards,
-Vladimir
Florian Fainelli May 4, 2020, 8:40 p.m. UTC | #2
On 5/4/2020 1:34 PM, Vladimir Oltean wrote:
> Hi Florian,
> 
> On Mon, 4 May 2020 at 23:19, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>> When ndo_get_phys_port_name() for the CPU port was added we introduced
>> an early check for when the DSA master network device in
>> dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When
>> we perform the teardown operation in dsa_master_ndo_teardown() we would
>> not be checking that cpu_dp->orig_ndo_ops was successfully allocated and
>> non-NULL initialized.
>>
>> With network device drivers such as virtio_net, this leads to a NPD as
>> soon as the DSA switch hanging off of it gets torn down because we are
>> now assigning the virtio_net device's netdev_ops a NULL pointer.
>>
>> Fixes: da7b9e9b00d4 ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
>> Reported-by: Allen Pais <allen.pais@oracle.com>
>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>> ---
> 
> The fix makes complete sense.
> But on another note, if we don't overlay an ndo_get_phys_port_name if
> the master already has one, doesn't that render the entire mechanism
> of having a reliable way for user space to determine the CPU port
> number pointless?

For the CPU port I would consider ndo_get_phys_port_name() to be more
best effort than an absolute need unlike the user facing ports, where
this is necessary for a variety of actions (e.g.: determining
queues/port numbers etc.) which is why there was no overlay being done
in that case. There is not a good way to cascade the information other
than do something like pX.Y and defining what the X and Y are, what do
you think?
Vladimir Oltean May 4, 2020, 8:49 p.m. UTC | #3
On Mon, 4 May 2020 at 23:40, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
>
>
> On 5/4/2020 1:34 PM, Vladimir Oltean wrote:
> > Hi Florian,
> >
> > On Mon, 4 May 2020 at 23:19, Florian Fainelli <f.fainelli@gmail.com> wrote:
> >>
> >> When ndo_get_phys_port_name() for the CPU port was added we introduced
> >> an early check for when the DSA master network device in
> >> dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When
> >> we perform the teardown operation in dsa_master_ndo_teardown() we would
> >> not be checking that cpu_dp->orig_ndo_ops was successfully allocated and
> >> non-NULL initialized.
> >>
> >> With network device drivers such as virtio_net, this leads to a NPD as
> >> soon as the DSA switch hanging off of it gets torn down because we are
> >> now assigning the virtio_net device's netdev_ops a NULL pointer.
> >>
> >> Fixes: da7b9e9b00d4 ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
> >> Reported-by: Allen Pais <allen.pais@oracle.com>
> >> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> >> ---
> >
> > The fix makes complete sense.
> > But on another note, if we don't overlay an ndo_get_phys_port_name if
> > the master already has one, doesn't that render the entire mechanism
> > of having a reliable way for user space to determine the CPU port
> > number pointless?
>
> For the CPU port I would consider ndo_get_phys_port_name() to be more
> best effort than an absolute need unlike the user facing ports, where
> this is necessary for a variety of actions (e.g.: determining
> queues/port numbers etc.) which is why there was no overlay being done
> in that case. There is not a good way to cascade the information other
> than do something like pX.Y and defining what the X and Y are, what do
> you think?
> --
> Florian

For the CPU/master port I am not actually sure who is the final
consumer of the ndo_get_phys_port_name, I thought it is simply
informational, with the observation that it may be unreliable in
transmitting that information over.
Speaking of which, if "informational" is the only purpose, could this
not be used?

devlink port | grep "flavour cpu"
pci/0000:00:00.5/4: type notset flavour cpu port 4
spi/spi2.0/4: type notset flavour cpu port 4
spi/spi2.1/4: type notset flavour cpu port 4

Thanks,
-Vladimir
Florian Fainelli May 4, 2020, 9:03 p.m. UTC | #4
On 5/4/2020 1:49 PM, Vladimir Oltean wrote:
> On Mon, 4 May 2020 at 23:40, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>>
>>
>> On 5/4/2020 1:34 PM, Vladimir Oltean wrote:
>>> Hi Florian,
>>>
>>> On Mon, 4 May 2020 at 23:19, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>>>
>>>> When ndo_get_phys_port_name() for the CPU port was added we introduced
>>>> an early check for when the DSA master network device in
>>>> dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When
>>>> we perform the teardown operation in dsa_master_ndo_teardown() we would
>>>> not be checking that cpu_dp->orig_ndo_ops was successfully allocated and
>>>> non-NULL initialized.
>>>>
>>>> With network device drivers such as virtio_net, this leads to a NPD as
>>>> soon as the DSA switch hanging off of it gets torn down because we are
>>>> now assigning the virtio_net device's netdev_ops a NULL pointer.
>>>>
>>>> Fixes: da7b9e9b00d4 ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
>>>> Reported-by: Allen Pais <allen.pais@oracle.com>
>>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>>> ---
>>>
>>> The fix makes complete sense.
>>> But on another note, if we don't overlay an ndo_get_phys_port_name if
>>> the master already has one, doesn't that render the entire mechanism
>>> of having a reliable way for user space to determine the CPU port
>>> number pointless?
>>
>> For the CPU port I would consider ndo_get_phys_port_name() to be more
>> best effort than an absolute need unlike the user facing ports, where
>> this is necessary for a variety of actions (e.g.: determining
>> queues/port numbers etc.) which is why there was no overlay being done
>> in that case. There is not a good way to cascade the information other
>> than do something like pX.Y and defining what the X and Y are, what do
>> you think?
>> --
>> Florian
> 
> For the CPU/master port I am not actually sure who is the final
> consumer of the ndo_get_phys_port_name, I thought it is simply
> informational, with the observation that it may be unreliable in
> transmitting that information over.
> Speaking of which, if "informational" is the only purpose, could this
> not be used?

Yes, I had not considered that devlink would expose that information.
ndo_get_phys_port_name() is there now though, and since it is exposed
through sysfs, reverting it would be an ABI breakage.

> 
> devlink port | grep "flavour cpu"
> pci/0000:00:00.5/4: type notset flavour cpu port 4
> spi/spi2.0/4: type notset flavour cpu port 4
> spi/spi2.1/4: type notset flavour cpu port 4
> 
> Thanks,
> -Vladimir
>
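For readers who want to see what "exposed through sysfs" means from user space, here is a minimal sketch that reads the phys_port_name attribute of a network interface. The helper names phys_port_name_path() and read_phys_port_name() are made up for this illustration and are not part of any kernel or library API; the sketch only assumes the usual /sys/class/net/<ifname>/phys_port_name attribute. The read simply fails when the attribute cannot be opened or read, for example when the interface does not exist or its driver does not provide a port name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the sysfs path of the phys_port_name
 * attribute for ifname. Returns 0 on success, -1 if buf is too small. */
static int phys_port_name_path(const char *ifname, char *buf, size_t len)
{
	int n;

	assert(ifname != NULL && buf != NULL);
	n = snprintf(buf, len, "/sys/class/net/%s/phys_port_name", ifname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: read the attribute into out, without the trailing
 * newline. Returns 0 on success, -1 if the file cannot be opened or read. */
static int read_phys_port_name(const char *ifname, char *out, size_t len)
{
	char path[256];
	FILE *f;

	if (len == 0 || phys_port_name_path(ifname, path, sizeof(path)) != 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

Calling read_phys_port_name("eth0", buf, sizeof(buf)) on a DSA master would be one way for a script or tool to consume the value discussed in this thread; the interface name is only an example.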
Allen May 5, 2020, 6:07 a.m. UTC | #5
> When ndo_get_phys_port_name() for the CPU port was added we introduced
> an early check for when the DSA master network device in
> dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When
> we perform the teardown operation in dsa_master_ndo_teardown() we would
> not be checking that cpu_dp->orig_ndo_ops was successfully allocated and
> non-NULL initialized.
> 
> With network device drivers such as virtio_net, this leads to a NPD as
> soon as the DSA switch hanging off of it gets torn down because we are
> now assigning the virtio_net device's netdev_ops a NULL pointer.
> 
> Fixes: da7b9e9b00d4 ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
> Reported-by: Allen Pais <allen.pais@oracle.com>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Tested-by: Allen Pais <allen.pais@oracle.com>

Thank you Florian.
> ---
>   net/dsa/master.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/dsa/master.c b/net/dsa/master.c
> index b5c535af63a3..a621367c6e8c 100644
> --- a/net/dsa/master.c
> +++ b/net/dsa/master.c
> @@ -289,7 +289,8 @@ static void dsa_master_ndo_teardown(struct net_device *dev)
>   {
>   	struct dsa_port *cpu_dp = dev->dsa_ptr;
>   
> -	dev->netdev_ops = cpu_dp->orig_ndo_ops;
> +	if (cpu_dp->orig_ndo_ops)
> +		dev->netdev_ops = cpu_dp->orig_ndo_ops;
>   	cpu_dp->orig_ndo_ops = NULL;
>   }
>   
>
David Miller May 7, 2020, 12:32 a.m. UTC | #6
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Mon,  4 May 2020 13:18:06 -0700

> When ndo_get_phys_port_name() for the CPU port was added we introduced
> an early check for when the DSA master network device in
> dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When
> we perform the teardown operation in dsa_master_ndo_teardown() we would
> not be checking that cpu_dp->orig_ndo_ops was successfully allocated and
> non-NULL initialized.
> 
> With network device drivers such as virtio_net, this leads to a NPD as
> soon as the DSA switch hanging off of it gets torn down because we are
> now assigning the virtio_net device's netdev_ops a NULL pointer.
> 
> Fixes: da7b9e9b00d4 ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
> Reported-by: Allen Pais <allen.pais@oracle.com>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Applied and queued up for -stable, thanks Florian.

Patch

diff --git a/net/dsa/master.c b/net/dsa/master.c
index b5c535af63a3..a621367c6e8c 100644
--- a/net/dsa/master.c
+++ b/net/dsa/master.c
@@ -289,7 +289,8 @@  static void dsa_master_ndo_teardown(struct net_device *dev)
 {
 	struct dsa_port *cpu_dp = dev->dsa_ptr;
 
-	dev->netdev_ops = cpu_dp->orig_ndo_ops;
+	if (cpu_dp->orig_ndo_ops)
+		dev->netdev_ops = cpu_dp->orig_ndo_ops;
 	cpu_dp->orig_ndo_ops = NULL;
 }
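
The failure mode and the fix can be modelled outside the kernel. The following is a simplified user-space sketch, not the code in net/dsa/master.c: the struct definitions are cut-down stand-ins, the model_* function names are invented for this illustration, and malloc()/free() stand in for whatever allocation the kernel uses. It shows why the setup path can leave orig_ndo_ops NULL (early return when the master already provides ndo_get_phys_port_name) and why teardown must only restore netdev_ops when something was saved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Cut-down stand-ins for the kernel structures (assumption, not real). */
struct net_device_ops {
	int (*ndo_get_phys_port_name)(char *name, size_t len);
};

struct dsa_port {
	const struct net_device_ops *orig_ndo_ops;
};

struct net_device {
	const struct net_device_ops *netdev_ops;
	struct dsa_port *dsa_ptr;
};

/* Placeholder callback; the returned name is arbitrary for this model. */
static int model_get_phys_port_name(char *name, size_t len)
{
	return snprintf(name, len, "p0") >= (int)len ? -1 : 0;
}

/* Models the setup path: overlay the ops unless the master already
 * implements ndo_get_phys_port_name. */
static int model_ndo_setup(struct net_device *dev)
{
	struct dsa_port *cpu_dp = dev->dsa_ptr;
	struct net_device_ops *ops;

	assert(cpu_dp != NULL);

	/* Early return: orig_ndo_ops is never assigned and stays NULL. */
	if (dev->netdev_ops->ndo_get_phys_port_name)
		return 0;

	ops = malloc(sizeof(*ops));
	if (!ops)
		return -1;

	cpu_dp->orig_ndo_ops = dev->netdev_ops;
	memcpy(ops, cpu_dp->orig_ndo_ops, sizeof(*ops));
	ops->ndo_get_phys_port_name = model_get_phys_port_name;
	dev->netdev_ops = ops;
	return 0;
}

/* Models the fixed teardown: restore only if setup saved something. */
static void model_ndo_teardown(struct net_device *dev)
{
	struct dsa_port *cpu_dp = dev->dsa_ptr;

	if (cpu_dp->orig_ndo_ops) {
		/* Model-only cleanup of the overlay allocated in setup. */
		free((void *)dev->netdev_ops);
		dev->netdev_ops = cpu_dp->orig_ndo_ops;
	}
	cpu_dp->orig_ndo_ops = NULL;
}
```

Without the if in model_ndo_teardown(), a device whose ops already provide ndo_get_phys_port_name would end up with netdev_ops set to NULL after teardown, which is the situation reported with virtio_net.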