diff mbox series

[net-next,10/14] net: vlan: Propagate MC addresses with VID through switchdev

Message ID 20190116200102.2749-11-f.fainelli@gmail.com
State Changes Requested
Delegated to: David Miller
Headers show
Series net: dsa: management mode for bcm_sf2 | expand

Commit Message

Florian Fainelli Jan. 16, 2019, 8 p.m. UTC
The VLAN real device could be an Ethernet switch port and that switch
might have VLAN filtering globally enabled (because of a bridge
requesting VLAN filtering on the switch on another port) and so when
programming multicast addresses, we need the multicast filter
programming to be aware of the correct VLAN ID as well.

Ethernet drivers that do not implement switchdev_port_{add,del}
operations and do not specifically check for SWITCHDEV_OBJ_ID_HOST_MDB
are not affected by that change.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 net/8021q/vlan_dev.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

Comments

Ido Schimmel Jan. 17, 2019, 2:49 p.m. UTC | #1
On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote:
> The VLAN real device could be an Ethernet switch port and that switch
> might have VLAN filtering globally enabled (because of a bridge
> requesting VLAN filtering on the switch on another port) and so when
> programming multicast addresses, we need the multicast filter
> programming to be aware of the correct VLAN ID as well.

This looks like a quirk of a specific device. How bad is it to patch the
driver to add a multicast address for every configured VLAN?

Also, I think it's weird that we have one API to program address and a
completely different API (via switchdev) to program address+VID pairs.
Extending current API might make more sense.
Florian Fainelli Jan. 17, 2019, 7:12 p.m. UTC | #2
On 1/17/19 6:49 AM, Ido Schimmel wrote:
> On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote:
>> The VLAN real device could be an Ethernet switch port and that switch
>> might have VLAN filtering globally enabled (because of a bridge
>> requesting VLAN filtering on the switch on another port) and so when
>> programming multicast addresses, we need the multicast filter
>> programming to be aware of the correct VLAN ID as well.
> 
> This looks like a quirk of a specific device. How bad is it to patch the
> driver to add a multicast address for every configured VLAN?

There is at least another driver that can be benefit from that which is
cpsw, if I understand Ivan's use case correctly.

If there is a ndo_set_rx_mode() function implemented by the virtual
device, and that does call dev_{mc,uc}_sync(master, dev), then this
means that you do want to be able to filter UC and MC addresses. If we
added the entire class D range of multicast addresses to the switch's
MDB, that would not be filtering, we would be passing up everything to
the stack and let it filter in software because there is no multicast
socket listening on that address.

> 
> Also, I think it's weird that we have one API to program address and a
> completely different API (via switchdev) to program address+VID pairs.
> Extending current API might make more sense.
> 

Do you mean ndo_set_rx_mode() and dev_mc_sync()? That is what Ivan
proposed doing not so long ago here:

https://www.spinics.net/lists/netdev/msg537424.html

but that is IMHO wasting storage space, because the kernel is
maintaining the address lists, and now also needs to gain knowledge
about the VID. With up to 4K - 2 VLAN interfaces per switch port, this
bloats the memory footprint, we arguably still need to maintain those
address lists anyway...

The reason why I chose switchdev here is because:

- this is mostly relevant for switch devices, not so much for NICs (it
seems), if it was, they would have solved the problem by now

- this allows to have an unified path from the switch driver perspective
to program MDB addresses targeting the CPU/management port, no need to
have X different ways of doing the same operation
Ido Schimmel Jan. 21, 2019, 9:13 a.m. UTC | #3
On Thu, Jan 17, 2019 at 11:12:24AM -0800, Florian Fainelli wrote:
> On 1/17/19 6:49 AM, Ido Schimmel wrote:
> > On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote:
> >> The VLAN real device could be an Ethernet switch port and that switch
> >> might have VLAN filtering globally enabled (because of a bridge
> >> requesting VLAN filtering on the switch on another port) and so when
> >> programming multicast addresses, we need the multicast filter
> >> programming to be aware of the correct VLAN ID as well.
> > 
> > This looks like a quirk of a specific device. How bad is it to patch the
> > driver to add a multicast address for every configured VLAN?
> 
> There is at least another driver that can be benefit from that which is
> cpsw, if I understand Ivan's use case correctly.

I understand and I have no argument against the need for this. I just
think we should use a different mechanism than switchdev.

> If there is a ndo_set_rx_mode() function implemented by the virtual
> device, and that does call dev_{mc,uc}_sync(master, dev), then this
> means that you do want to be able to filter UC and MC addresses. If we
> added the entire class D range of multicast addresses to the switch's
> MDB, that would not be filtering, we would be passing up everything to
> the stack and let it filter in software because there is no multicast
> socket listening on that address.

OK.

> > Also, I think it's weird that we have one API to program address and a
> > completely different API (via switchdev) to program address+VID pairs.
> > Extending current API might make more sense.
> > 
> 
> Do you mean ndo_set_rx_mode() and dev_mc_sync()? That is what Ivan
> proposed doing not so long ago here:
> 
> https://www.spinics.net/lists/netdev/msg537424.html
> 
> but that is IMHO wasting storage space, because the kernel is
> maintaining the address lists, and now also needs to gain knowledge
> about the VID. With up to 4K - 2 VLAN interfaces per switch port, this
> bloats the memory footprint, we arguably still need to maintain those
> address lists anyway...

I didn't review Ivan's changes in details, but it makes much more sense
to me to simply extend the current Rx filtering mechanism than to use a
completely unrelated infrastructure.

> 
> The reason why I chose switchdev here is because:
> 
> - this is mostly relevant for switch devices, not so much for NICs (it
> seems), if it was, they would have solved the problem by now

I don't see any use of switchdev APIs in the driver Ivan is patching.
The cover letter doesn't indicate anything about it either.

> - this allows to have an unified path from the switch driver perspective
> to program MDB addresses targeting the CPU/management port, no need to
> have X different ways of doing the same operation

But it's not the same thing. Allowing certain packets to ingress the
device is not the same as having the device send them to the CPU. We
have VLAN filters as well. Allowing VID X to ingress does not mean that
we trap each packet with this VID to CPU.
Ilias Apalodimas Jan. 21, 2019, 9:17 a.m. UTC | #4
Hi Ido,
> > 
> > The reason why I chose switchdev here is because:
> > 
> > - this is mostly relevant for switch devices, not so much for NICs (it
> > seems), if it was, they would have solved the problem by now
> 
> I don't see any use of switchdev APIs in the driver Ivan is patching.
> The cover letter doesn't indicate anything about it either.

There were RFCs for it a few months ago https://patchwork.ozlabs.org/cover/929367/

We decided that rewriting the driver instead of adding switchdev support on
the current one is cleaner and preferred, so we'll be posting a new driver for
this at some point (most of the work is already done).

> 
> > - this allows to have an unified path from the switch driver perspective
> > to program MDB addresses targeting the CPU/management port, no need to
> > have X different ways of doing the same operation
> 
> But it's not the same thing. Allowing certain packets to ingress the
> device is not the same as having the device send them to the CPU. We
> have VLAN filters as well. Allowing VID X to ingress does not mean that
> we trap each packet with this VID to CPU.
/Ilias
Ivan Khoronzhuk Jan. 22, 2019, 11:30 a.m. UTC | #5
On Mon, Jan 21, 2019 at 09:13:01AM +0000, Ido Schimmel wrote:
>On Thu, Jan 17, 2019 at 11:12:24AM -0800, Florian Fainelli wrote:
>> On 1/17/19 6:49 AM, Ido Schimmel wrote:
>> > On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote:
>> >> The VLAN real device could be an Ethernet switch port and that switch
>> >> might have VLAN filtering globally enabled (because of a bridge
>> >> requesting VLAN filtering on the switch on another port) and so when
>> >> programming multicast addresses, we need the multicast filter
>> >> programming to be aware of the correct VLAN ID as well.
>> >
>> > This looks like a quirk of a specific device. How bad is it to patch the
>> > driver to add a multicast address for every configured VLAN?
>>
>> There is at least another driver that can be benefit from that which is
>> cpsw, if I understand Ivan's use case correctly.
>
>I understand and I have no argument against the need for this. I just
>think we should use a different mechanism than switchdev.
>
>> If there is a ndo_set_rx_mode() function implemented by the virtual
>> device, and that does call dev_{mc,uc}_sync(master, dev), then this
>> means that you do want to be able to filter UC and MC addresses. If we
>> added the entire class D range of multicast addresses to the switch's
>> MDB, that would not be filtering, we would be passing up everything to
>> the stack and let it filter in software because there is no multicast
>> socket listening on that address.
>
>OK.
>
>> > Also, I think it's weird that we have one API to program address and a
>> > completely different API (via switchdev) to program address+VID pairs.
>> > Extending current API might make more sense.
>> >
>>
>> Do you mean ndo_set_rx_mode() and dev_mc_sync()? That is what Ivan
>> proposed doing not so long ago here:
>>
>> https://www.spinics.net/lists/netdev/msg537424.html
>>
>> but that is IMHO wasting storage space, because the kernel is
>> maintaining the address lists, and now also needs to gain knowledge
>> about the VID. With up to 4K - 2 VLAN interfaces per switch port, this
>> bloats the memory footprint, we arguably still need to maintain those
>> address lists anyway...

Yes wasting storage it's a factor, even it brings a lot of other
benefits. But compared to this it's more legal.

As alternative I've also proposed completely working model when vlan
identification is real dev responsibility, allowing to split address
space between vlans (for mc and uc). And better to add more commments
to the referencies I provided there.

>
>I didn't review Ivan's changes in details, but it makes much more sense
>to me to simply extend the current Rx filtering mechanism than to use a
>completely unrelated infrastructure.

True.

>
>>
>> The reason why I chose switchdev here is because:
>>
>> - this is mostly relevant for switch devices, not so much for NICs (it
>> seems), if it was, they would have solved the problem by now
>
>I don't see any use of switchdev APIs in the driver Ivan is patching.
>The cover letter doesn't indicate anything about it either.

Yes, it's not related to switched.

>
>> - this allows to have an unified path from the switch driver perspective
>> to program MDB addresses targeting the CPU/management port, no need to
>> have X different ways of doing the same operation
>
>But it's not the same thing. Allowing certain packets to ingress the
>device is not the same as having the device send them to the CPU. We
>have VLAN filters as well. Allowing VID X to ingress does not mean that
>we trap each packet with this VID to CPU.
Ivan Khoronzhuk Jan. 22, 2019, 11:39 a.m. UTC | #6
On Thu, Jan 17, 2019 at 11:12:24AM -0800, Florian Fainelli wrote:
>On 1/17/19 6:49 AM, Ido Schimmel wrote:
>> On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote:
>>> The VLAN real device could be an Ethernet switch port and that switch
>>> might have VLAN filtering globally enabled (because of a bridge
>>> requesting VLAN filtering on the switch on another port) and so when
>>> programming multicast addresses, we need the multicast filter
>>> programming to be aware of the correct VLAN ID as well.
>>
>> This looks like a quirk of a specific device. How bad is it to patch the
>> driver to add a multicast address for every configured VLAN?
>
>There is at least another driver that can be benefit from that which is
>cpsw, if I understand Ivan's use case correctly.
>
>If there is a ndo_set_rx_mode() function implemented by the virtual
>device, and that does call dev_{mc,uc}_sync(master, dev), then this
>means that you do want to be able to filter UC and MC addresses. If we
>added the entire class D range of multicast addresses to the switch's
>MDB, that would not be filtering, we would be passing up everything to
>the stack and let it filter in software because there is no multicast
>socket listening on that address.
>
>>
>> Also, I think it's weird that we have one API to program address and a
>> completely different API (via switchdev) to program address+VID pairs.
>> Extending current API might make more sense.
>>
>
>Do you mean ndo_set_rx_mode() and dev_mc_sync()? That is what Ivan
>proposed doing not so long ago here:
>
>https://www.spinics.net/lists/netdev/msg537424.html
>
>but that is IMHO wasting storage space, because the kernel is
>maintaining the address lists, and now also needs to gain knowledge
>about the VID. With up to 4K - 2 VLAN interfaces per switch port, this
>bloats the memory footprint, we arguably still need to maintain those
>address lists anyway...
>
>The reason why I chose switchdev here is because:
>
>- this is mostly relevant for switch devices, not so much for NICs (it
>seems), if it was, they would have solved the problem by now
If provide normal interface for this it could spread.
And personally my aim is to avoid redundant mcast traffic when i don't
expect it saving CPU performance, latency .... and overall it should
be done in this way, switchdev here seems like otth.

>
>- this allows to have an unified path from the switch driver perspective
>to program MDB addresses targeting the CPU/management port, no need to
>have X different ways of doing the same operation
>-- 
>Florian
diff mbox series

Patch

diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index b2d9c8f27cd7..ea2ef9d78dcb 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -312,6 +312,43 @@  static int vlan_dev_open(struct net_device *dev)
 	return err;
 }
 
+static int vlan_dev_sync_unsync_mc_addr(struct net_device *dev,
+                                        const unsigned char *addr,
+                                        bool add)
+{
+	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	struct switchdev_obj_port_mdb mdb = {
+		.obj = {
+			.orig_dev = dev,
+			.id = SWITCHDEV_OBJ_ID_HOST_MDB,
+			.flags = SWITCHDEV_F_DEFER,
+		},
+		.vid = vlan_dev_vlan_id(dev),
+	};
+	int ret = -EOPNOTSUPP;
+
+	ether_addr_copy(mdb.addr, addr);
+        if (add)
+		ret = switchdev_port_obj_add(real_dev, &mdb.obj, NULL);
+        else
+		ret = switchdev_port_obj_del(real_dev, &mdb.obj);
+
+	return ret;
+}
+
+static int vlan_dev_sync_mc_addr(struct net_device *dev,
+                                 const unsigned char *addr)
+{
+	return vlan_dev_sync_unsync_mc_addr(dev, addr, true);
+}
+
+static int vlan_dev_unsync_mc_addr(struct net_device *dev,
+                                   const unsigned char *addr)
+{
+	return vlan_dev_sync_unsync_mc_addr(dev, addr, false);
+}
+
+
 static int vlan_dev_stop(struct net_device *dev)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
@@ -319,6 +356,7 @@  static int vlan_dev_stop(struct net_device *dev)
 
 	dev_mc_unsync(real_dev, dev);
 	dev_uc_unsync(real_dev, dev);
+	__hw_addr_unsync_dev(&dev->mc, dev, vlan_dev_unsync_mc_addr);
 	if (dev->flags & IFF_ALLMULTI)
 		dev_set_allmulti(real_dev, -1);
 	if (dev->flags & IFF_PROMISC)
@@ -483,6 +521,8 @@  static void vlan_dev_change_rx_flags(struct net_device *dev, int change)
 
 static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
 {
+	__hw_addr_sync_dev(&vlan_dev->mc, vlan_dev, vlan_dev_sync_mc_addr,
+			   vlan_dev_unsync_mc_addr);
 	dev_mc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
 	dev_uc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
 }