diff mbox

[RFC,1/2] net: dsa: integrate with SWITCHDEV for HW bridging

Message ID 20150223042220.GA20063@lunn.ch
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Andrew Lunn Feb. 23, 2015, 4:22 a.m. UTC
On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
> >On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
> >>On 02/17/2015 11:26 AM, Florian Fainelli wrote:
> >>>In order to support bridging offloads in DSA switch drivers, select
> >>>NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
> >>>NDOs that we are required to implement.
> >>>
> >>>To facilitate the integratation at the DSA driver level, we implement 3
> >>>types of operations:
> >>>
> >>
> >>Hi Florian,
> >>
> >>>
> >>>+/* Return a bitmask of all ports being currently bridged. Note that on
> >>>+ * leave, the mask will still return the bitmask of ports currently bridged,
> >>>+ * prior to port removal, and this is exactly what we want.
> >>>+ */
> >>>+static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
> >>>+{
> >>>+	unsigned int port;
> >>>+	u32 mask = 0;
> >>>+
> >>>+	for (port = 0; port < DSA_MAX_PORTS; port++) {
> >>>+		if (!((1 << port) & ds->phys_port_mask))
> >>>+			continue;
> >>>+
> >>>+		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
> >>>+			mask |= 1 << port;
> >>
> >>Problem is that the function can be called through dsa_slave_netdevice_event
> >>before the slave devices are fully initialized.
> >>
> >>After adding
> >>
> >>+               if (!ds->ports[port]) {
> >>+                       netdev_err(bridge,
> >>+                                  "No ports data for port %d, mask=0x%x\n",
> >>+                                  port, ds->phys_port_mask);
> >>+                       continue;
> >>+               }
> >>
> >>and with some more debug messages added to dsa_switch_setup(), I see the following.
> >>
> >>[   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 1(port1)
> >>[   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 2(port2)
> >>[   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 3(port3)
> >>[   14.472002] br0: No ports data for port 3, mask=0x1e
> >>[   14.472053] br0: No ports data for port 4, mask=0x1e
> >>[   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 4(host2esb)
> >>
> >>This happens if I add the bridge configuration to /etc/network/interfaces instead
> >>of creating the bridge manually. Apparently dsa_switch_setup() is not yet complete
> >>when dsa_slave_netdevice_event is executed to handle a state change on one of its
> >>newly created slave interfaces.
> >>
> >>The relevant information from /etc/network/interfaces is:
> >>
> >>auto br0
> >>
> >>iface port1 inet manual
> >>iface port2 inet manual
> >>
> >>iface br0 inet dhcp
> >>	bridge_ports port1 port2
> >
> >Hi Guenter
> >
> >Does this actually matter? The ports which don't exists yet are not
> >being added to the bridge. The mask will come out correct. What
> >happens when port4 is made a member of the bridge? I expect it
> >works. It is the creation of the interface which triggers hotplug to
> >read interfaces and add the interface to the port.
> >
> 
> 	if (!ds->ports[port])
> 		continue;
> 
> might be an option. However, I am not sure that what you say is correct,
> at least not strictly speaking. dsa_slave_create() returns the created
> slave device, which is added to ds->ports[port] in dsa_switch_setup().
> Since there is no protection in dsa_switch_setup(), there is no guarantee
> that the callback doesn't happen prior to the initialization of
> ds->ports[port]. So the above would leave a race condition, where the
> port being added to the bridge _is_ one for which ds->ports[port] is
> not yet initialized.

Yes, you are correct, there is a race here.


> Protecting the entire slave creation loop in dsa_switch_setup()
> and using register_netdevice() in dsa_slave_create() solves the problem
> as far as I can see, I just don't know if it is an acceptable solution.

Or:


Not tested. But the point being, ensure everything is setup before
calling register_netdev().

	Andrew
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Florian Fainelli Feb. 23, 2015, 4:33 a.m. UTC | #1
2015-02-22 20:22 GMT-08:00 Andrew Lunn <andrew@lunn.ch>:
> On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
>> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
>> >On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
>> >>On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>> >>>In order to support bridging offloads in DSA switch drivers, select
>> >>>NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>> >>>NDOs that we are required to implement.
>> >>>
>> >>>To facilitate the integratation at the DSA driver level, we implement 3
>> >>>types of operations:
>> >>>
>> >>
>> >>Hi Florian,
>> >>
>> >>>
>> >>>+/* Return a bitmask of all ports being currently bridged. Note that on
>> >>>+ * leave, the mask will still return the bitmask of ports currently bridged,
>> >>>+ * prior to port removal, and this is exactly what we want.
>> >>>+ */
>> >>>+static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>> >>>+{
>> >>>+  unsigned int port;
>> >>>+  u32 mask = 0;
>> >>>+
>> >>>+  for (port = 0; port < DSA_MAX_PORTS; port++) {
>> >>>+          if (!((1 << port) & ds->phys_port_mask))
>> >>>+                  continue;
>> >>>+
>> >>>+          if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>> >>>+                  mask |= 1 << port;
>> >>
>> >>Problem is that the function can be called through dsa_slave_netdevice_event
>> >>before the slave devices are fully initialized.
>> >>
>> >>After adding
>> >>
>> >>+               if (!ds->ports[port]) {
>> >>+                       netdev_err(bridge,
>> >>+                                  "No ports data for port %d, mask=0x%x\n",
>> >>+                                  port, ds->phys_port_mask);
>> >>+                       continue;
>> >>+               }
>> >>
>> >>and with some more debug messages added to dsa_switch_setup(), I see the following.
>> >>
>> >>[   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 1(port1)
>> >>[   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 2(port2)
>> >>[   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 3(port3)
>> >>[   14.472002] br0: No ports data for port 3, mask=0x1e
>> >>[   14.472053] br0: No ports data for port 4, mask=0x1e
>> >>[   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 4(host2esb)
>> >>
>> >>This happens if I add the bridge configuration to /etc/network/interfaces instead
>> >>of creating the bridge manually. Apparently dsa_switch_setup() is not yet complete
>> >>when dsa_slave_netdevice_event is executed to handle a state change on one of its
>> >>newly created slave interfaces.
>> >>
>> >>The relevant information from /etc/network/interfaces is:
>> >>
>> >>auto br0
>> >>
>> >>iface port1 inet manual
>> >>iface port2 inet manual
>> >>
>> >>iface br0 inet dhcp
>> >>    bridge_ports port1 port2
>> >
>> >Hi Guenter
>> >
>> >Does this actually matter? The ports which don't exists yet are not
>> >being added to the bridge. The mask will come out correct. What
>> >happens when port4 is made a member of the bridge? I expect it
>> >works. It is the creation of the interface which triggers hotplug to
>> >read interfaces and add the interface to the port.
>> >
>>
>>       if (!ds->ports[port])
>>               continue;
>>
>> might be an option. However, I am not sure that what you say is correct,
>> at least not strictly speaking. dsa_slave_create() returns the created
>> slave device, which is added to ds->ports[port] in dsa_switch_setup().
>> Since there is no protection in dsa_switch_setup(), there is no guarantee
>> that the callback doesn't happen prior to the initialization of
>> ds->ports[port]. So the above would leave a race condition, where the
>> port being added to the bridge _is_ one for which ds->ports[port] is
>> not yet initialized.
>
> Yes, you are correct, there is a race here.
>
>
>> Protecting the entire slave creation loop in dsa_switch_setup()
>> and using register_netdevice() in dsa_slave_create() solves the problem
>> as far as I can see, I just don't know if it is an acceptable solution.
>
> Or:
>
> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> index 2173402d87e0..1aa120d6d0e4 100644
> --- a/net/dsa/dsa.c
> +++ b/net/dsa/dsa.c
> @@ -325,8 +325,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
>                                    index, i, pd->port_names[i]);
>                         continue;
>                 }
> -
> -               ds->ports[i] = slave_dev;
>         }
>
>  #ifdef CONFIG_NET_DSA_HWMON
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index f23deadf42a0..d6004072a957 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -669,12 +669,14 @@ dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>                 free_netdev(slave_dev);
>                 return NULL;
>         }
> +       ds->ports[port] = slave_dev;
>
>         ret = register_netdev(slave_dev);
>         if (ret) {
>                 netdev_err(master, "error %d registering interface %s\n",
>                            ret, slave_dev->name);
>                 phy_disconnect(p->phy);
> +               ds->ports[port] = NULL;
>                 free_netdev(slave_dev);
>                 return NULL;
>         }
>
> Not tested. But the point being, ensure everything is setup before
> calling register_netdev().

I prefer that too over using locking around the dsa slave network
device creation, provided that this works for Guenter as well.
Guenter Roeck Feb. 23, 2015, 4:38 a.m. UTC | #2
On 02/22/2015 08:22 PM, Andrew Lunn wrote:
> On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
>> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
>>> On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
>>>> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>>>>> In order to support bridging offloads in DSA switch drivers, select
>>>>> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>>>>> NDOs that we are required to implement.
>>>>>
>>>>> To facilitate the integratation at the DSA driver level, we implement 3
>>>>> types of operations:
>>>>>
>>>>
>>>> Hi Florian,
>>>>
>>>>>
>>>>> +/* Return a bitmask of all ports being currently bridged. Note that on
>>>>> + * leave, the mask will still return the bitmask of ports currently bridged,
>>>>> + * prior to port removal, and this is exactly what we want.
>>>>> + */
>>>>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>>>>> +{
>>>>> +	unsigned int port;
>>>>> +	u32 mask = 0;
>>>>> +
>>>>> +	for (port = 0; port < DSA_MAX_PORTS; port++) {
>>>>> +		if (!((1 << port) & ds->phys_port_mask))
>>>>> +			continue;
>>>>> +
>>>>> +		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>>>>> +			mask |= 1 << port;
>>>>
>>>> Problem is that the function can be called through dsa_slave_netdevice_event
>>>> before the slave devices are fully initialized.
>>>>
>>>> After adding
>>>>
>>>> +               if (!ds->ports[port]) {
>>>> +                       netdev_err(bridge,
>>>> +                                  "No ports data for port %d, mask=0x%x\n",
>>>> +                                  port, ds->phys_port_mask);
>>>> +                       continue;
>>>> +               }
>>>>
>>>> and with some more debug messages added to dsa_switch_setup(), I see the following.
>>>>
>>>> [   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 1(port1)
>>>> [   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 2(port2)
>>>> [   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 3(port3)
>>>> [   14.472002] br0: No ports data for port 3, mask=0x1e
>>>> [   14.472053] br0: No ports data for port 4, mask=0x1e
>>>> [   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 4(host2esb)
>>>>
>>>> This happens if I add the bridge configuration to /etc/network/interfaces instead
>>>> of creating the bridge manually. Apparently dsa_switch_setup() is not yet complete
>>>> when dsa_slave_netdevice_event is executed to handle a state change on one of its
>>>> newly created slave interfaces.
>>>>
>>>> The relevant information from /etc/network/interfaces is:
>>>>
>>>> auto br0
>>>>
>>>> iface port1 inet manual
>>>> iface port2 inet manual
>>>>
>>>> iface br0 inet dhcp
>>>> 	bridge_ports port1 port2
>>>
>>> Hi Guenter
>>>
>>> Does this actually matter? The ports which don't exists yet are not
>>> being added to the bridge. The mask will come out correct. What
>>> happens when port4 is made a member of the bridge? I expect it
>>> works. It is the creation of the interface which triggers hotplug to
>>> read interfaces and add the interface to the port.
>>>
>>
>> 	if (!ds->ports[port])
>> 		continue;
>>
>> might be an option. However, I am not sure that what you say is correct,
>> at least not strictly speaking. dsa_slave_create() returns the created
>> slave device, which is added to ds->ports[port] in dsa_switch_setup().
>> Since there is no protection in dsa_switch_setup(), there is no guarantee
>> that the callback doesn't happen prior to the initialization of
>> ds->ports[port]. So the above would leave a race condition, where the
>> port being added to the bridge _is_ one for which ds->ports[port] is
>> not yet initialized.
>
> Yes, you are correct, there is a race here.
>
>
>> Protecting the entire slave creation loop in dsa_switch_setup()
>> and using register_netdevice() in dsa_slave_create() solves the problem
>> as far as I can see, I just don't know if it is an acceptable solution.
>
> Or:
>
> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> index 2173402d87e0..1aa120d6d0e4 100644
> --- a/net/dsa/dsa.c
> +++ b/net/dsa/dsa.c
> @@ -325,8 +325,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
>                                     index, i, pd->port_names[i]);
>                          continue;
>                  }
> -
> -               ds->ports[i] = slave_dev;
>          }
>
>   #ifdef CONFIG_NET_DSA_HWMON
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index f23deadf42a0..d6004072a957 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -669,12 +669,14 @@ dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>                  free_netdev(slave_dev);
>                  return NULL;
>          }
> +       ds->ports[port] = slave_dev;
>
>          ret = register_netdev(slave_dev);
>          if (ret) {
>                  netdev_err(master, "error %d registering interface %s\n",
>                             ret, slave_dev->name);
>                  phy_disconnect(p->phy);
> +               ds->ports[port] = NULL;
>                  free_netdev(slave_dev);
>                  return NULL;
>          }
>
> Not tested. But the point being, ensure everything is setup before
> calling register_netdev().
>

That should work. I'll give it a try.

Guenter

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Fainelli Feb. 23, 2015, 4:43 a.m. UTC | #3
2015-02-22 20:38 GMT-08:00 Guenter Roeck <linux@roeck-us.net>:
> On 02/22/2015 08:22 PM, Andrew Lunn wrote:
>>
>> On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
>>>
>>> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
>>>>
>>>> On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
>>>>>
>>>>> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>>>>>>
>>>>>> In order to support bridging offloads in DSA switch drivers, select
>>>>>> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>>>>>> NDOs that we are required to implement.
>>>>>>
>>>>>> To facilitate the integratation at the DSA driver level, we implement
>>>>>> 3
>>>>>> types of operations:
>>>>>>
>>>>>
>>>>> Hi Florian,
>>>>>
>>>>>>
>>>>>> +/* Return a bitmask of all ports being currently bridged. Note that
>>>>>> on
>>>>>> + * leave, the mask will still return the bitmask of ports currently
>>>>>> bridged,
>>>>>> + * prior to port removal, and this is exactly what we want.
>>>>>> + */
>>>>>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>>>>>> +{
>>>>>> +       unsigned int port;
>>>>>> +       u32 mask = 0;
>>>>>> +
>>>>>> +       for (port = 0; port < DSA_MAX_PORTS; port++) {
>>>>>> +               if (!((1 << port) & ds->phys_port_mask))
>>>>>> +                       continue;
>>>>>> +
>>>>>> +               if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>>>>>> +                       mask |= 1 << port;
>>>>>
>>>>>
>>>>> Problem is that the function can be called through
>>>>> dsa_slave_netdevice_event
>>>>> before the slave devices are fully initialized.
>>>>>
>>>>> After adding
>>>>>
>>>>> +               if (!ds->ports[port]) {
>>>>> +                       netdev_err(bridge,
>>>>> +                                  "No ports data for port %d,
>>>>> mask=0x%x\n",
>>>>> +                                  port, ds->phys_port_mask);
>>>>> +                       continue;
>>>>> +               }
>>>>>
>>>>> and with some more debug messages added to dsa_switch_setup(), I see
>>>>> the following.
>>>>>
>>>>> [   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>> port 1(port1)
>>>>> [   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>> port 2(port2)
>>>>> [   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>> port 3(port3)
>>>>> [   14.472002] br0: No ports data for port 3, mask=0x1e
>>>>> [   14.472053] br0: No ports data for port 4, mask=0x1e
>>>>> [   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>> port 4(host2esb)
>>>>>
>>>>> This happens if I add the bridge configuration to
>>>>> /etc/network/interfaces instead
>>>>> of creating the bridge manually. Apparently dsa_switch_setup() is not
>>>>> yet complete
>>>>> when dsa_slave_netdevice_event is executed to handle a state change on
>>>>> one of its
>>>>> newly created slave interfaces.
>>>>>
>>>>> The relevant information from /etc/network/interfaces is:
>>>>>
>>>>> auto br0
>>>>>
>>>>> iface port1 inet manual
>>>>> iface port2 inet manual
>>>>>
>>>>> iface br0 inet dhcp
>>>>>         bridge_ports port1 port2
>>>>
>>>>
>>>> Hi Guenter
>>>>
>>>> Does this actually matter? The ports which don't exists yet are not
>>>> being added to the bridge. The mask will come out correct. What
>>>> happens when port4 is made a member of the bridge? I expect it
>>>> works. It is the creation of the interface which triggers hotplug to
>>>> read interfaces and add the interface to the port.
>>>>
>>>
>>>         if (!ds->ports[port])
>>>                 continue;
>>>
>>> might be an option. However, I am not sure that what you say is correct,
>>> at least not strictly speaking. dsa_slave_create() returns the created
>>> slave device, which is added to ds->ports[port] in dsa_switch_setup().
>>> Since there is no protection in dsa_switch_setup(), there is no guarantee
>>> that the callback doesn't happen prior to the initialization of
>>> ds->ports[port]. So the above would leave a race condition, where the
>>> port being added to the bridge _is_ one for which ds->ports[port] is
>>> not yet initialized.
>>
>>
>> Yes, you are correct, there is a race here.
>>
>>
>>> Protecting the entire slave creation loop in dsa_switch_setup()
>>> and using register_netdevice() in dsa_slave_create() solves the problem
>>> as far as I can see, I just don't know if it is an acceptable solution.
>>
>>
>> Or:
>>
>> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
>> index 2173402d87e0..1aa120d6d0e4 100644
>> --- a/net/dsa/dsa.c
>> +++ b/net/dsa/dsa.c
>> @@ -325,8 +325,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int
>> index,
>>                                     index, i, pd->port_names[i]);
>>                          continue;
>>                  }
>> -
>> -               ds->ports[i] = slave_dev;
>>          }
>>
>>   #ifdef CONFIG_NET_DSA_HWMON
>> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
>> index f23deadf42a0..d6004072a957 100644
>> --- a/net/dsa/slave.c
>> +++ b/net/dsa/slave.c
>> @@ -669,12 +669,14 @@ dsa_slave_create(struct dsa_switch *ds, struct
>> device *parent,
>>                  free_netdev(slave_dev);
>>                  return NULL;
>>          }
>> +       ds->ports[port] = slave_dev;
>>
>>          ret = register_netdev(slave_dev);
>>          if (ret) {
>>                  netdev_err(master, "error %d registering interface %s\n",
>>                             ret, slave_dev->name);
>>                  phy_disconnect(p->phy);
>> +               ds->ports[port] = NULL;
>>                  free_netdev(slave_dev);
>>                  return NULL;
>>          }
>>
>> Not tested. But the point being, ensure everything is setup before
>> calling register_netdev().
>>
>
> That should work. I'll give it a try.

BTW, before I re-submit this patch series, do you think we should
introduce a fdb_flush() callback that switch drivers are required to
implement, and invoke it from net/dsa/slave.c upon port join/leave?

Thanks!
Guenter Roeck Feb. 23, 2015, 6:19 a.m. UTC | #4
On 02/22/2015 08:43 PM, Florian Fainelli wrote:
> 2015-02-22 20:38 GMT-08:00 Guenter Roeck <linux@roeck-us.net>:
>> On 02/22/2015 08:22 PM, Andrew Lunn wrote:
>>>
>>> On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
>>>>
>>>> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
>>>>>
>>>>> On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
>>>>>>
>>>>>> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>>>>>>>
>>>>>>> In order to support bridging offloads in DSA switch drivers, select
>>>>>>> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>>>>>>> NDOs that we are required to implement.
>>>>>>>
>>>>>>> To facilitate the integratation at the DSA driver level, we implement
>>>>>>> 3
>>>>>>> types of operations:
>>>>>>>
>>>>>>
>>>>>> Hi Florian,
>>>>>>
>>>>>>>
>>>>>>> +/* Return a bitmask of all ports being currently bridged. Note that
>>>>>>> on
>>>>>>> + * leave, the mask will still return the bitmask of ports currently
>>>>>>> bridged,
>>>>>>> + * prior to port removal, and this is exactly what we want.
>>>>>>> + */
>>>>>>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>>>>>>> +{
>>>>>>> +       unsigned int port;
>>>>>>> +       u32 mask = 0;
>>>>>>> +
>>>>>>> +       for (port = 0; port < DSA_MAX_PORTS; port++) {
>>>>>>> +               if (!((1 << port) & ds->phys_port_mask))
>>>>>>> +                       continue;
>>>>>>> +
>>>>>>> +               if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>>>>>>> +                       mask |= 1 << port;
>>>>>>
>>>>>>
>>>>>> Problem is that the function can be called through
>>>>>> dsa_slave_netdevice_event
>>>>>> before the slave devices are fully initialized.
>>>>>>
>>>>>> After adding
>>>>>>
>>>>>> +               if (!ds->ports[port]) {
>>>>>> +                       netdev_err(bridge,
>>>>>> +                                  "No ports data for port %d,
>>>>>> mask=0x%x\n",
>>>>>> +                                  port, ds->phys_port_mask);
>>>>>> +                       continue;
>>>>>> +               }
>>>>>>
>>>>>> and with some more debug messages added to dsa_switch_setup(), I see
>>>>>> the following.
>>>>>>
>>>>>> [   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>>> port 1(port1)
>>>>>> [   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>>> port 2(port2)
>>>>>> [   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>>> port 3(port3)
>>>>>> [   14.472002] br0: No ports data for port 3, mask=0x1e
>>>>>> [   14.472053] br0: No ports data for port 4, mask=0x1e
>>>>>> [   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>>> port 4(host2esb)
>>>>>>
>>>>>> This happens if I add the bridge configuration to
>>>>>> /etc/network/interfaces instead
>>>>>> of creating the bridge manually. Apparently dsa_switch_setup() is not
>>>>>> yet complete
>>>>>> when dsa_slave_netdevice_event is executed to handle a state change on
>>>>>> one of its
>>>>>> newly created slave interfaces.
>>>>>>
>>>>>> The relevant information from /etc/network/interfaces is:
>>>>>>
>>>>>> auto br0
>>>>>>
>>>>>> iface port1 inet manual
>>>>>> iface port2 inet manual
>>>>>>
>>>>>> iface br0 inet dhcp
>>>>>>          bridge_ports port1 port2
>>>>>
>>>>>
>>>>> Hi Guenter
>>>>>
>>>>> Does this actually matter? The ports which don't exists yet are not
>>>>> being added to the bridge. The mask will come out correct. What
>>>>> happens when port4 is made a member of the bridge? I expect it
>>>>> works. It is the creation of the interface which triggers hotplug to
>>>>> read interfaces and add the interface to the port.
>>>>>
>>>>
>>>>          if (!ds->ports[port])
>>>>                  continue;
>>>>
>>>> might be an option. However, I am not sure that what you say is correct,
>>>> at least not strictly speaking. dsa_slave_create() returns the created
>>>> slave device, which is added to ds->ports[port] in dsa_switch_setup().
>>>> Since there is no protection in dsa_switch_setup(), there is no guarantee
>>>> that the callback doesn't happen prior to the initialization of
>>>> ds->ports[port]. So the above would leave a race condition, where the
>>>> port being added to the bridge _is_ one for which ds->ports[port] is
>>>> not yet initialized.
>>>
>>>
>>> Yes, you are correct, there is a race here.
>>>
>>>
>>>> Protecting the entire slave creation loop in dsa_switch_setup()
>>>> and using register_netdevice() in dsa_slave_create() solves the problem
>>>> as far as I can see, I just don't know if it is an acceptable solution.
>>>
>>>
>>> Or:
>>>
>>> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
>>> index 2173402d87e0..1aa120d6d0e4 100644
>>> --- a/net/dsa/dsa.c
>>> +++ b/net/dsa/dsa.c
>>> @@ -325,8 +325,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int
>>> index,
>>>                                      index, i, pd->port_names[i]);
>>>                           continue;
>>>                   }
>>> -
>>> -               ds->ports[i] = slave_dev;
>>>           }
>>>
>>>    #ifdef CONFIG_NET_DSA_HWMON
>>> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
>>> index f23deadf42a0..d6004072a957 100644
>>> --- a/net/dsa/slave.c
>>> +++ b/net/dsa/slave.c
>>> @@ -669,12 +669,14 @@ dsa_slave_create(struct dsa_switch *ds, struct
>>> device *parent,
>>>                   free_netdev(slave_dev);
>>>                   return NULL;
>>>           }
>>> +       ds->ports[port] = slave_dev;
>>>
>>>           ret = register_netdev(slave_dev);
>>>           if (ret) {
>>>                   netdev_err(master, "error %d registering interface %s\n",
>>>                              ret, slave_dev->name);
>>>                   phy_disconnect(p->phy);
>>> +               ds->ports[port] = NULL;
>>>                   free_netdev(slave_dev);
>>>                   return NULL;
>>>           }
>>>
>>> Not tested. But the point being, ensure everything is setup before
>>> calling register_netdev().
>>>
>>
>> That should work. I'll give it a try.
>
Looks like it works, with a couple of additional changes.
dsa_slave_create doesn't need to return slave_dev anymore,
and I still need to check if ds->ports[port] is NULL in
dsa_slave_br_port_mask.

> BTW, before I re-submit this patch series, do you think we should
> introduce a fdb_flush() callback that switch drivers are required to
> implement, and invoke it from net/dsa/slave.c upon port join/leave?
>

It also needs to be called for state changes. Right now I call it from
the driver code. I don't really have a strong opinion either way.

One caveat: state updates may be called with soft interrupts disabled,
meaning I can only schedule a state change but not directly execute it.
You don't see that in the sf2 code since you don't use mdio to talk
with the switch. This should probably be documented somewhere.

Guenter

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Lunn Feb. 23, 2015, 1:34 p.m. UTC | #5
> BTW, before I re-submit this patch series, do you think we should
> introduce a fdb_flush() callback that switch drivers are required to
> implement, and invoke it from net/dsa/slave.c upon port join/leave?

Probably a good idea.

We should define exactly what is flushed. Everything? Or only dynamic
entries? The Marvell hardware can also have multicast addresses in its
tables, which can age, or static unicast and multicast entries.

I guess at some point, somebody will start thinking about IGMP
snooping and then we might need separated operations.

	Andrew
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Feb. 23, 2015, 2:23 p.m. UTC | #6
On 02/23/2015 05:34 AM, Andrew Lunn wrote:
>> BTW, before I re-submit this patch series, do you think we should
>> introduce a fdb_flush() callback that switch drivers are required to
>> implement, and invoke it from net/dsa/slave.c upon port join/leave?
>
> Probably a good idea.
>
> We should define exactly what is flushed. Everything? Or only dynamic
> entries? The Marvell hardware can also have multicast addresses in its
> tables, which can age, or static unicast and multicast entries.
>

I currently use ATU command 110 (flush all non-static entries in a
particular FID). I see means to flush either all entries or all
non-static entries, but no means to only flush unicast or multicast
entries. Does any of the standards distinguish between learned unicast
and multicast addresses ? Flushing those selectively might be a
challenge.

Guenter

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Lunn Feb. 23, 2015, 4:01 p.m. UTC | #7
> I currently use ATU command 110 (flush all non-static entries in a
> particular FID). I see means to flush either all entries or all
> non-static entries, but no means to only flush unicast or multicast
> entries. Does any of the standards distinguish between learned unicast
> and multicast addresses ? Flushing those selectively might be a
> challenge.

You might need to walk the table and flush records individually if you
are only interested in one type.

We should also consider do we need to make these flush operations
atomic with respect to other operations? Do we need to disable
learning, flush, change the port STP status, and then enable learning?

	  Andrew

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Fainelli Feb. 23, 2015, 6:05 p.m. UTC | #8
On 23/02/15 08:01, Andrew Lunn wrote:
>> I currently use ATU command 110 (flush all non-static entries in a
>> particular FID). I see means to flush either all entries or all
>> non-static entries, but no means to only flush unicast or multicast
>> entries. Does any of the standards distinguish between learned unicast
>> and multicast addresses ? Flushing those selectively might be a
>> challenge.

Lucky you, on Broadcom switches you have to issue an ARL search, get the
results (there are all valid MAC entries, fortunately), and invalidate
the entries one by one for your particular ports of interest, there is
no "flush all non-static entries".

> 
> You might need to walk the table and flush records individually if you
> are only interested in one type.
> 
> We should also consider do we need to make these flush operations
> atomic with respect to other operations? Do we need to disable
> learning, flush, change the port STP status, and then enable learning?

I think we may have to do this to guarantee no race conditions between
flushing the switch's FDB, although it would look like only "joining" a
bridge needs to be a more controlled operation, on leave we can probably
just leave the bridge, flush entries and the switch port will start
learning new MAC addresses, right?

Alternatively, would not setting a very low aging timeout and
maintaining HW learning still allow us to simplify these operations?
Guenter Roeck Feb. 23, 2015, 6:35 p.m. UTC | #9
On Mon, Feb 23, 2015 at 10:05:41AM -0800, Florian Fainelli wrote:
> On 23/02/15 08:01, Andrew Lunn wrote:
> >> I currently use ATU command 110 (flush all non-static entries in a
> >> particular FID). I see means to flush either all entries or all
> >> non-static entries, but no means to only flush unicast or multicast
> >> entries. Does any of the standards distinguish between learned unicast
> >> and multicast addresses ? Flushing those selectively might be a
> >> challenge.
> 
> Lucky you, on Broadcom switches you have to issue an ARL search, get the
> results (there are all valid MAC entries, fortunately), and invalidate
> the entries one by one for your particular ports of interest, there is
> no "flush all non-static entries".
> 
> > 
> > You might need to walk the table and flush records individually if you
> > are only interested in one type.
> > 
> > We should also consider do we need to make these flush operations
> > atomic with respect to other operations? Do we need to disable
> > learning, flush, change the port STP status, and then enable learning?
> 
Wonder what if anything RSTP specifies for flush operation details.

> I think we may have to do this to guarantee no race conditions between
> flushing the switch's FDB, although it would look like only "joining" a
> bridge needs to be a more controlled operation, on leave we can probably
> just leave the bridge, flush entries and the switch port will start
> learning new MAC addresses, right?
> 
> Alternatively, would not setting a very low aging timeout and
> maintaining HW learning still allow us to simplify these operations?

That is what STP specifies. With RSTP, the expectation is that the database
is flushed immediately on port status changes. Also, the minimum aging
period on Marvell switches is 15 seconds, which is way too long for RSTP.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrey Volkov Feb. 25, 2015, 1:43 p.m. UTC | #10
Gunter, Florian,

Le 23/02/2015 19:35, Guenter Roeck a wrote :
> On Mon, Feb 23, 2015 at 10:05:41AM -0800, Florian Fainelli wrote:
>> On 23/02/15 08:01, Andrew Lunn wrote:
>>>> I currently use ATU command 110 (flush all non-static entries in a
>>>> particular FID). I see means to flush either all entries or all
>>>> non-static entries, but no means to only flush unicast or multicast
>>>> entries. Does any of the standards distinguish between learned unicast
>>>> and multicast addresses ? Flushing those selectively might be a
>>>> challenge.
>> Lucky you, on Broadcom switches you have to issue an ARL search, get the
>> results (there are all valid MAC entries, fortunately), and invalidate
>> the entries one by one for your particular ports of interest, there is
>> no "flush all non-static entries".
>>
>>> You might need to walk the table and flush records individually if you
>>> are only interested in one type.
>>>
>>> We should also consider do we need to make these flush operations
>>> atomic with respect to other operations? Do we need to disable
>>> learning, flush, change the port STP status, and then enable learning?
> Wonder what if anything RSTP specifies for flush operation details.
>
>> I think we may have to do this to guarantee no race conditions between
>> flushing the switch's FDB, although it would look like only "joining" a
>> bridge needs to be a more controlled operation, on leave we can probably
>> just leave the bridge, flush entries and the switch port will start
>> learning new MAC addresses, right?
>>
>> Alternatively, would not setting a very low aging timeout and
>> maintaining HW learning still allow us to simplify these operations?
> That is what STP specifies. With RSTP, the expectation is that the database
> is flushed immediately on port status changes. Also, the minimum aging
> period on Marvell switches is 15 seconds, which is way too long for RSTP.
>
> Guenter
>
I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
(I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will 
contain something useful at the enter time. Also I've look through yours patches and I haven't
seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?

Btw your current FID implementation contain funny security problem: same ports in the different chips,
interconnected by DSA, will have same FID and as result they will treated as bridged together by
internal switch logic...

Andrey
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Feb. 25, 2015, 2:25 p.m. UTC | #11
Andrey,

On 02/25/2015 05:43 AM, Andrey Volkov wrote:
> Gunter, Florian,
>
> Le 23/02/2015 19:35, Guenter Roeck a wrote :
>> On Mon, Feb 23, 2015 at 10:05:41AM -0800, Florian Fainelli wrote:
>>> On 23/02/15 08:01, Andrew Lunn wrote:
>>>>> I currently use ATU command 110 (flush all non-static entries in a
>>>>> particular FID). I see means to flush either all entries or all
>>>>> non-static entries, but no means to only flush unicast or multicast
>>>>> entries. Does any of the standards distinguish between learned unicast
>>>>> and multicast addresses ? Flushing those selectively might be a
>>>>> challenge.
>>> Lucky you, on Broadcom switches you have to issue an ARL search, get the
>>> results (there are all valid MAC entries, fortunately), and invalidate
>>> the entries one by one for your particular ports of interest, there is
>>> no "flush all non-static entries".
>>>
>>>> You might need to walk the table and flush records individually if you
>>>> are only interested in one type.
>>>>
>>>> We should also consider do we need to make these flush operations
>>>> atomic with respect to other operations? Do we need to disable
>>>> learning, flush, change the port STP status, and then enable learning?
>> Wonder what if anything RSTP specifies for flush operation details.
>>
>>> I think we may have to do this to guarantee no race conditions between
>>> flushing the switch's FDB, although it would look like only "joining" a
>>> bridge needs to be a more controlled operation, on leave we can probably
>>> just leave the bridge, flush entries and the switch port will start
>>> learning new MAC addresses, right?
>>>
>>> Alternatively, would not setting a very low aging timeout and
>>> maintaining HW learning still allow us to simplify these operations?
>> That is what STP specifies. With RSTP, the expectation is that the database
>> is flushed immediately on port status changes. Also, the minimum aging
>> period on Marvell switches is 15 seconds, which is way too long for RSTP.
>>
>> Guenter
>>
> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
> contain something useful at the enter time. Also I've look through yours patches and I haven't

Does removing a port from a fid clean up the entries associated with it
in the database ?

> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
>
Not me. That would be difficult to test without real hardware.

The above suggests that you have a HW bridge implementation for Marvell chips as well.
Would it make sense to merge our implementations, or just use yours if it is better ?

> Btw your current FID implementation contain funny security problem: same ports in the different chips,
> interconnected by DSA, will have same FID and as result they will treated as bridged together by
> internal switch logic...
>
You mean if multiple switch chips are used ? Those ports are configured to only send
data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
looked into multi-chip applications, so there may well be some problems. Maybe
it is possible to merge a chip ID into the fid to solve it.

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrey Volkov Feb. 25, 2015, 4:05 p.m. UTC | #12
Le 25/02/2015 15:25, Guenter Roeck wrote:
> Andrey,
> 
> On 02/25/2015 05:43 AM, Andrey Volkov wrote:
>> Gunter, Florian,
>>
>> Le 23/02/2015 19:35, Guenter Roeck a wrote :
>>> On Mon, Feb 23, 2015 at 10:05:41AM -0800, Florian Fainelli wrote:
>>>> On 23/02/15 08:01, Andrew Lunn wrote:
>>>>>> I currently use ATU command 110 (flush all non-static entries in a
>>>>>> particular FID). I see means to flush either all entries or all
>>>>>> non-static entries, but no means to only flush unicast or multicast
>>>>>> entries. Does any of the standards distinguish between learned unicast
>>>>>> and multicast addresses ? Flushing those selectively might be a
>>>>>> challenge.
>>>> Lucky you, on Broadcom switches you have to issue an ARL search, get the
>>>> results (there are all valid MAC entries, fortunately), and invalidate
>>>> the entries one by one for your particular ports of interest, there is
>>>> no "flush all non-static entries".
>>>>
>>>>> You might need to walk the table and flush records individually if you
>>>>> are only interested in one type.
>>>>>
>>>>> We should also consider do we need to make these flush operations
>>>>> atomic with respect to other operations? Do we need to disable
>>>>> learning, flush, change the port STP status, and then enable learning?
>>> Wonder what if anything RSTP specifies for flush operation details.
>>>
>>>> I think we may have to do this to guarantee no race conditions between
>>>> flushing the switch's FDB, although it would look like only "joining" a
>>>> bridge needs to be a more controlled operation, on leave we can probably
>>>> just leave the bridge, flush entries and the switch port will start
>>>> learning new MAC addresses, right?
>>>>
>>>> Alternatively, would not setting a very low aging timeout and
>>>> maintaining HW learning still allow us to simplify these operations?
>>> That is what STP specifies. With RSTP, the expectation is that the database
>>> is flushed immediately on port status changes. Also, the minimum aging
>>> period on Marvell switches is 15 seconds, which is way too long for RSTP.
>>>
>>> Guenter
>>>
>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>> contain something useful at the enter time. Also I've look through yours patches and I haven't
> 
> Does removing a port from a fid clean up the entries associated with it
> in the database ?
> 
It doesn't, sorry that I didn't described it clearly: I've tried to point to that fact that 
changing FID will cause 2 things:
 - learn/discard/... process for all following packets will begin from scratch (as it should be)
 - we could start (potentially) slow database cleanup process in dedicated thread/work, and we may not
   care about appearing of new ATU rules for the removed port, since packets now will be rejected 
   by port's logic.

>> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
>>
> Not me. That would be difficult to test without real hardware.
Not a problem for me :), I've already monster switch containing three different types of Marvell's chips 
just before me on my table.

> 
> The above suggests that you have a HW bridge implementation for Marvell chips as well.
> Would it make sense to merge our implementations, or just use yours if it is better ?
I've implemented same thing almost by same way, so for me it will be easer to rebase on top of your jobs,
especially due to the fact that I've enforced to use very old kernel: proprietary binary blobs...

> 
>> Btw your current FID implementation contain funny security problem: same ports in the different chips,
>> interconnected by DSA, will have same FID and as result they will treated as bridged together by
>> internal switch logic...
>>
> You mean if multiple switch chips are used ? Those ports are configured to only send
> data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
> looked into multi-chip applications, so there may well be some problems.

My current project is to implement support of something like:

       .----------.    .--------.
       |  CPU1    |    |  CPU2  |
 .DSA--o (master) |    |        |
 |     '----------'    'o-------'
 |                  .---'
 | .-----.       .--o--.       .-----.
 '-o SW1 o--DSA--o SW2 o==DSA==o SW3 |
   '-----'       '-----'       '-----'
     |             |              |
   ports         ports          ports

Where SW2 and SW3 are interconnected by "trunk", everything managed by CPU1,
some ports of SW1-SW3 are bridged with CPU2, some with CPU1, and some bridged 
independently of CPUs. Also, as I told before, all SWs are from 
different chips families, so I'm using all, except 88e6060 and 6171, Marvell's drivers.

> Maybe
> it is possible to merge a chip ID into the fid to solve it.
Will not work IMHO, since to support interswitch bridges, some ports must have common id's,
so we should have some enumeration management at level of the DSA tree.
I've already implemented it as a free running counter, but implementation is wrong, terrible
and must be redesigned by hlists or alike.

Regards
Andrey
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Feb. 25, 2015, 5:41 p.m. UTC | #13
Andrey,

On Wed, Feb 25, 2015 at 05:05:12PM +0100, Andrey Volkov wrote:
> > 
> > Does removing a port from a fid clean up the entries associated with it
> > in the database ?
> > 
> It doesn't, sorry that I didn't described it clearly: I've tried to point to that fact that 
> changing FID will cause 2 things:
>  - learn/discard/... process for all following packets will begin from scratch (as it should be)
>  - we could start (potentially) slow database cleanup process in dedicated thread/work, and we may not
>    care about appearing of new ATU rules for the removed port, since packets now will be rejected 
>    by port's logic.
> 
Any idea what happens if a packet is received which has an fdb entry
pointing to port X, which was just removed from the bridge group ?

> >> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
> >>
> > Not me. That would be difficult to test without real hardware.
> Not a problem for me :), I've already monster switch containing three different types of Marvell's chips 
> just before me on my table.
> 

Lucky (or unlucky ;-) you.

> > 
> > The above suggests that you have a HW bridge implementation for Marvell chips as well.
> > Would it make sense to merge our implementations, or just use yours if it is better ?
> I've implemented same thing almost by same way, so for me it will be easer to rebase on top of your jobs,
> especially due to the fact that I've enforced to use very old kernel: proprietary binary blobs...
> 

Can you by any chance share your code, and or do you plan
to submit it ?

I'll have to look into multi-bridge implementations at some point
in the future, so that would help a lot.

> > 
> >> Btw your current FID implementation contain funny security problem: same ports in the different chips,
> >> interconnected by DSA, will have same FID and as result they will treated as bridged together by
> >> internal switch logic...
> >>
> > You mean if multiple switch chips are used ? Those ports are configured to only send
> > data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
> > looked into multi-chip applications, so there may well be some problems.
> 
> My current project is to implement support of something like:
> 
>        .----------.    .--------.
>        |  CPU1    |    |  CPU2  |
>  .DSA--o (master) |    |        |
>  |     '----------'    'o-------'
>  |                  .---'
>  | .-----.       .--o--.       .-----.
>  '-o SW1 o--DSA--o SW2 o==DSA==o SW3 |
>    '-----'       '-----'       '-----'
>      |             |              |
>    ports         ports          ports
> 
> Where SW2 and SW3 are interconnected by "trunk", everything managed by CPU1,
> some ports of SW1-SW3 are bridged with CPU2, some with CPU1, and some bridged 
> independently of CPUs. Also, as I told before, all SWs are from 
> different chips families, so I'm using all, except 88e6060 and 6171, Marvell's drivers.
> 
Sounds like a lot of fun, especially if/when both CPUs start messing
with switch configuration.

> > Maybe
> > it is possible to merge a chip ID into the fid to solve it.
> Will not work IMHO, since to support interswitch bridges, some ports must have common id's,
> so we should have some enumeration management at level of the DSA tree.
> I've already implemented it as a free running counter, but implementation is wrong, terrible
> and must be redesigned by hlists or alike.
> 
Maybe use ida to get a well defined id for each bridge group touched
by dsa ? This id could then be used by the driver to identify a bridge.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrey Volkov Feb. 26, 2015, 1:18 a.m. UTC | #14
Guenter,

Le 25/02/2015 18:41, Guenter Roeck wrote :
> Andrey,
>
> On Wed, Feb 25, 2015 at 05:05:12PM +0100, Andrey Volkov wrote:
>>> Does removing a port from a fid clean up the entries associated with it
>>> in the database ?
>>>
>> It doesn't, sorry that I didn't described it clearly: I've tried to point to that fact that
>> changing FID will cause 2 things:
>>   - learn/discard/... process for all following packets will begin from scratch (as it should be)
>>   - we could start (potentially) slow database cleanup process in dedicated thread/work, and we may not
>>     care about appearing of new ATU rules for the removed port, since packets now will be rejected
>>     by port's logic.
>>
> Any idea what happens if a packet is received which has an fdb entry
> pointing to port X, which was just removed from the bridge group ?
I have some ideas (looks like it must be filtered out), but not sure,
I will do some experiments tomorrow.

>>>> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
>>>>
>>> Not me. That would be difficult to test without real hardware.
>> Not a problem for me :), I've already monster switch containing three different types of Marvell's chips
>> just before me on my table.
>>
> Lucky (or unlucky ;-) you.
M-m-m, how about undocumented 'PPU enable' 14'th bit and other such
gifts :D?

>>> The above suggests that you have a HW bridge implementation for Marvell chips as well.
>>> Would it make sense to merge our implementations, or just use yours if it is better ?
>> I've implemented same thing almost by same way, so for me it will be easer to rebase on top of your jobs,
>> especially due to the fact that I've enforced to use very old kernel: proprietary binary blobs...
>>
> Can you by any chance share your code, and or do you plan
> to submit it ?
Yes, sure, I plan to start submitting it in the first half of the March.

>
> I'll have to look into multi-bridge implementations at some point
> in the future, so that would help a lot.
>
>>>> Btw your current FID implementation contain funny security problem: same ports in the different chips,
>>>> interconnected by DSA, will have same FID and as result they will treated as bridged together by
>>>> internal switch logic...
>>>>
>>> You mean if multiple switch chips are used ? Those ports are configured to only send
>>> data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
>>> looked into multi-chip applications, so there may well be some problems.
>> My current project is to implement support of something like:
>>
>>         .----------.    .--------.
>>         |  CPU1    |    |  CPU2  |
>>   .DSA--o (master) |    |        |
>>   |     '----------'    'o-------'
>>   |                  .---'
>>   | .-----.       .--o--.       .-----.
>>   '-o SW1 o--DSA--o SW2 o==DSA==o SW3 |
>>     '-----'       '-----'       '-----'
>>       |             |              |
>>     ports         ports          ports
>>
>> Where SW2 and SW3 are interconnected by "trunk", everything managed by CPU1,
>> some ports of SW1-SW3 are bridged with CPU2, some with CPU1, and some bridged
>> independently of CPUs. Also, as I told before, all SWs are from
>> different chips families, so I'm using all, except 88e6060 and 6171, Marvell's drivers.
>>
> Sounds like a lot of fun, especially if/when both CPUs start messing
> with switch configuration.
Yeah, the toy is really interesting and powerful, and fortunately not so terrible:
CPU2  connected like "normal" client, meanwhile mixture of trunk/bridge/normal ports
connected through mixture of PHYs still "little bit" disturbing me :).

>
>>> Maybe
>>> it is possible to merge a chip ID into the fid to solve it.
>> Will not work IMHO, since to support interswitch bridges, some ports must have common id's,
>> so we should have some enumeration management at level of the DSA tree.
>> I've already implemented it as a free running counter, but implementation is wrong, terrible
>> and must be redesigned by hlists or alike.
>>
> Maybe use ida to get a well defined id for each bridge group touched
> by dsa ? This id could then be used by the driver to identify a bridge.
Hmm, probably you are right, I need thinking about it little bit more.
Just for any case: these shared ids must be unique not only for bridges,
but for the Marvell's trunks (hidden from user space) and normal ports too.

Florian, I suspect that Broadcom have same document's policy as Marvell,
so you can not tell us too much, but is the trunk/multiswitch stuff will be
interesting for your too?

Andrey

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrey Volkov Feb. 27, 2015, 5:09 p.m. UTC | #15
Gunter,

Sorry with response delay, I very was busy yesterday

Le 25/02/2015 15:25, Guenter Roeck a écrit :
> Andrey,

------- snip ------- 

>>>
>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>> contain something useful at the enter time. Also I've look through yours patches and I haven't
> 
> Does removing a port from a fid clean up the entries associated with it
> in the database ?

I've checked what happened when port changed its FID: switch logic block traffic to it
immediately, as far as I can see, meanwhile record still exists in the bridge database,
it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it 
will happened.

> 
>> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
>>
> Not me. That would be difficult to test without real hardware.
> 
> The above suggests that you have a HW bridge implementation for Marvell chips as well.
> Would it make sense to merge our implementations, or just use yours if it is better ?
> 
>> Btw your current FID implementation contain funny security problem: same ports in the different chips,
>> interconnected by DSA, will have same FID and as result they will treated as bridged together by
>> internal switch logic...
>>
> You mean if multiple switch chips are used ? Those ports are configured to only send
> data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
> looked into multi-chip applications, so there may well be some problems. Maybe
> it is possible to merge a chip ID into the fid to solve it.
> 
> Thanks,
> Guenter
> 

Regards,
Andrey

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Feb. 28, 2015, 7:53 a.m. UTC | #16
On 02/27/2015 09:09 AM, Andrey Volkov wrote:
> Gunter,
>
> Sorry with response delay, I very was busy yesterday
>
> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>> Andrey,
>
> ------- snip -------
>
>>>>
>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>
>> Does removing a port from a fid clean up the entries associated with it
>> in the database ?
>
> I've checked what happened when port changed its FID: switch logic block traffic to it
> immediately, as far as I can see, meanwhile record still exists in the bridge database,
> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
> will happened.
>
Hmm - interesting. I assume you mean updating port registers 5 and 6 ?

Different question: For 6185, did you write a new driver or extend an existing one ?
I found that it is quite similar to 6131, and that adding support for it to the 6131
driver should be straightforward.

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrey Volkov March 2, 2015, 2:38 p.m. UTC | #17
On 28/02/2015 08:53, Guenter Roeck wrote:
> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>> Gunter,
>>
>> Sorry with response delay, I very was busy yesterday
>>
>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>> Andrey,
>>
>> ------- snip -------
>>
>>>>>
>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>
>>> Does removing a port from a fid clean up the entries associated with it
>>> in the database ?
>>
>> I've checked what happened when port changed its FID: switch logic block traffic to it
>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>> will happened.
>>
> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
Yes sure, it's reason why we must disable the port before changing the FID.

> 
> Different question: For 6185, did you write a new driver or extend an existing one ?
> I found that it is quite similar to 6131, and that adding support for it to the 6131
> driver should be straightforward.
Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is 
large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.

> 
> Thanks,
> Guenter
> 
> 
--
Regards
Andrey
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck March 2, 2015, 2:49 p.m. UTC | #18
On 03/02/2015 06:38 AM, Andrey Volkov wrote:
>
> On 28/02/2015 08:53, Guenter Roeck wrote:
>> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>>> Gunter,
>>>
>>> Sorry with response delay, I very was busy yesterday
>>>
>>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>>> Andrey,
>>>
>>> ------- snip -------
>>>
>>>>>>
>>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>>
>>>> Does removing a port from a fid clean up the entries associated with it
>>>> in the database ?
>>>
>>> I've checked what happened when port changed its FID: switch logic block traffic to it
>>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>>> will happened.
>>>
>> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
> Yes sure, it's reason why we must disable the port before changing the FID.
>
Yes, I think we'll need to do that once we use the bits in register 5.

>>
>> Different question: For 6185, did you write a new driver or extend an existing one ?
>> I found that it is quite similar to 6131, and that adding support for it to the 6131
>> driver should be straightforward.
> Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
> of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is
> large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.
>

I have a patch set to change the number of ports to a variable in the 6131 driver.
Want me to submit it now ? Though I guess you must have pretty much the same,
so we can also use your approach. Let me know.

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrey Volkov March 2, 2015, 3:27 p.m. UTC | #19
On 02/03/2015 15:49, Guenter Roeck wrote:
> On 03/02/2015 06:38 AM, Andrey Volkov wrote:
>>
>> On 28/02/2015 08:53, Guenter Roeck wrote:
>>> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>>>> Gunter,
>>>>
>>>> Sorry with response delay, I very was busy yesterday
>>>>
>>>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>>>> Andrey,
>>>>
>>>> ------- snip -------
>>>>
>>>>>>>
>>>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>>>
>>>>> Does removing a port from a fid clean up the entries associated with it
>>>>> in the database ?
>>>>
>>>> I've checked what happened when port changed its FID: switch logic block traffic to it
>>>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>>>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>>>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>>>> will happened.
>>>>
>>> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
>> Yes sure, it's reason why we must disable the port before changing the FID.
>>
> Yes, I think we'll need to do that once we use the bits in register 5.
> 
>>>
>>> Different question: For 6185, did you write a new driver or extend an existing one ?
>>> I found that it is quite similar to 6131, and that adding support for it to the 6131
>>> driver should be straightforward.
>> Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
>> of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is
>> large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.
>>
> 
> I have a patch set to change the number of ports to a variable in the 6131 driver.
> Want me to submit it now ? Though I guess you must have pretty much the same,
> so we can also use your approach. Let me know.

I think that better to start from my patches: they are more generic and have support of sysfs
(should be useful for "MII over ethernet"). Also IMO it will be better if we continue exchange/prereview
our patches in more narrow mail list, since I do not want to pollute netdev by useless discussions about drafts.
Objections/suggestions?

--
Regards
Andrey
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck March 2, 2015, 5:16 p.m. UTC | #20
On 03/02/2015 07:27 AM, Andrey Volkov wrote:
> On 02/03/2015 15:49, Guenter Roeck wrote:
>> On 03/02/2015 06:38 AM, Andrey Volkov wrote:
>>>
>>> On 28/02/2015 08:53, Guenter Roeck wrote:
>>>> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>>>>> Gunter,
>>>>>
>>>>> Sorry with response delay, I very was busy yesterday
>>>>>
>>>>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>>>>> Andrey,
>>>>>
>>>>> ------- snip -------
>>>>>
>>>>>>>>
>>>>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>>>>
>>>>>> Does removing a port from a fid clean up the entries associated with it
>>>>>> in the database ?
>>>>>
>>>>> I've checked what happened when port changed its FID: switch logic block traffic to it
>>>>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>>>>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>>>>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>>>>> will happened.
>>>>>
>>>> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
>>> Yes sure, it's reason why we must disable the port before changing the FID.
>>>
>> Yes, I think we'll need to do that once we use the bits in register 5.
>>
>>>>
>>>> Different question: For 6185, did you write a new driver or extend an existing one ?
>>>> I found that it is quite similar to 6131, and that adding support for it to the 6131
>>>> driver should be straightforward.
>>> Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
>>> of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is
>>> large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.
>>>
>>
>> I have a patch set to change the number of ports to a variable in the 6131 driver.
>> Want me to submit it now ? Though I guess you must have pretty much the same,
>> so we can also use your approach. Let me know.
>
> I think that better to start from my patches: they are more generic and have support of sysfs
> (should be useful for "MII over ethernet"). Also IMO it will be better if we continue exchange/prereview

Sure, no problem. Only concern I have is that your patches don't seem to be available
in public, or maybe I missed the reference to it.

> our patches in more narrow mail list, since I do not want to pollute netdev by useless discussions about drafts.
> Objections/suggestions?
>
Sure, no problem, though personally I have no issues with the discussions
or with submitting draft patches, and I did not have the impression that
they are useless.

My patches are all in my repository at kernel.org; it is fine with me to keep
them (only) there if submitting drafts to netdev is considered pollution.

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrey Volkov March 4, 2015, 10:07 a.m. UTC | #21
Gunter,

On 02/03/2015 18:16, Guenter Roeck wrote:
> On 03/02/2015 07:27 AM, Andrey Volkov wrote:
>> On 02/03/2015 15:49, Guenter Roeck wrote:
>>> On 03/02/2015 06:38 AM, Andrey Volkov wrote:
>>>>
>>>> On 28/02/2015 08:53, Guenter Roeck wrote:
>>>>> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>>>>>> Gunter,
>>>>>>
>>>>>> Sorry with response delay, I very was busy yesterday
>>>>>>
>>>>>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>>>>>> Andrey,
>>>>>>
>>>>>> ------- snip -------
>>>>>>
>>>>>>>>>
>>>>>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>>>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>>>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>>>>>
>>>>>>> Does removing a port from a fid clean up the entries associated with it
>>>>>>> in the database ?
>>>>>>
>>>>>> I've checked what happened when port changed its FID: switch logic block traffic to it
>>>>>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>>>>>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>>>>>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>>>>>> will happened.
>>>>>>
>>>>> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
>>>> Yes sure, it's reason why we must disable the port before changing the FID.
>>>>
>>> Yes, I think we'll need to do that once we use the bits in register 5.
>>>
>>>>>
>>>>> Different question: For 6185, did you write a new driver or extend an existing one ?
>>>>> I found that it is quite similar to 6131, and that adding support for it to the 6131
>>>>> driver should be straightforward.
>>>> Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
>>>> of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is
>>>> large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.
>>>>
>>>
>>> I have a patch set to change the number of ports to a variable in the 6131 driver.
>>> Want me to submit it now ? Though I guess you must have pretty much the same,
>>> so we can also use your approach. Let me know.
>>
>> I think that better to start from my patches: they are more generic and have support of sysfs
>> (should be useful for "MII over ethernet"). Also IMO it will be better if we continue exchange/prereview
> 
> Sure, no problem. Only concern I have is that your patches don't seem to be available
> in public, or maybe I missed the reference to it.
No I didn't publish them yet, sorry, I plan to submit them at the end of next week,
as I've wrote before, but if this date is late for you, then yes, 
I think that we could begin from your patches.

> 
>> our patches in more narrow mail list, since I do not want to pollute netdev by useless discussions about drafts.
>> Objections/suggestions?
>>
> Sure, no problem, though personally I have no issues with the discussions
> or with submitting draft patches, and I did not have the impression that
> they are useless.
Ok, I've just mean to simplify discussion, especially that David told about his 
fresh hardware switches related project, so ok this mail list then then this mail list.

> 
> My patches are all in my repository at kernel.org; it is fine with me to keep
> them (only) there if submitting drafts to netdev is considered pollution.
Nice, I'll keep this thing in mind.

Thank you
Andrey
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 2173402d87e0..1aa120d6d0e4 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -325,8 +325,6 @@  dsa_switch_setup(struct dsa_switch_tree *dst, int index,
                                   index, i, pd->port_names[i]);
                        continue;
                }
-
-               ds->ports[i] = slave_dev;
        }
 
 #ifdef CONFIG_NET_DSA_HWMON
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index f23deadf42a0..d6004072a957 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -669,12 +669,14 @@  dsa_slave_create(struct dsa_switch *ds, struct device *parent,
                free_netdev(slave_dev);
                return NULL;
        }
+       ds->ports[port] = slave_dev;
 
        ret = register_netdev(slave_dev);
        if (ret) {
                netdev_err(master, "error %d registering interface %s\n",
                           ret, slave_dev->name);
                phy_disconnect(p->phy);
+               ds->ports[port] = NULL;
                free_netdev(slave_dev);
                return NULL;
        }