Patchwork [RFC] core: Add ioctls to control device unicast hw addresses

login
register
mail settings
Submitter John Fastabend
Date Feb. 27, 2013, 2:07 a.m.
Message ID <20130227020744.GA25063@nitbit.x32>
Download mbox | patch
Permalink /patch/223462/
State RFC
Delegated to: David Miller
Headers show

Comments

John Fastabend - Feb. 27, 2013, 2:07 a.m.
On Tue, Feb 26, 2013 at 05:15:03PM -0500, David Miller wrote:
> From: John Fastabend <john.r.fastabend@intel.com>
> Date: Tue, 26 Feb 2013 13:58:39 -0800
> 
> > [...]
> > 
> >>>>
> >>>>
> >>>> [RFC PATCH] rtnetlink: Add support for adding/removing additional hw
> >>>> addresses.
> >>>>
> >>>> Add an ability to add and remove HW addresses to the device
> >>>> unicast and multicast address lists.  Right now, we only have
> >>>> an ioctl() to manage the multicast addresses and there is no
> >>>> way the manage the unicast list.
> >>>>
> >>>> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
> >>>
> >>> This is a step in the right direction, and you're right that there is
> >>> a difficulty in detecting whether support exists or not.
> >>>
> >>> I am so surprised that we've have ->set_rx_mode() support for multiple
> >>> unicast MAC addresses in so many drivers all this time, yet no way
> >>> outside of FDB to make use of it at all.
> >>
> >> And even that is not always available.  In most drivers it requires
> >> module parameters or other explicit configuration steps.  Meanwhile
> >> set_rx_mode() doesn't seem to depend on any of those and just does the
> >> right thing.
> >>
> >> For what I was trying to do ioctl() was a really easy way out for both
> >> kernel and user space implementation, so I gave is shot.
> >>
> >> -vlad
> >>
> > 
> > Don't we already support this with
> 
> The whole point is that these multiple-unicast-address configuration
> facilities are inaccessible without FDB, and there is no reason
> whatsoever for that.

Yes I see now sorry I was behind the thread.

We could just use the fdb hooks and add a dflt routine to handle the
case where the netdev doesn't provide a specific ndo_op for us. This
boils down to calling set_rx_mode anyways through this call chain,

   ndo_fdb_add
	dev_uc_add_excl
		__dev_set_rx_mode
			ndo_set_rx_mode

As long as folks don't think this is too much of an abuse of the fdb
hooks. Here's an example I only compile tested this so take it as
an example only. It probably needs some cleanups and the dump routine
needs some thought not sure you want to dump all MACs always.


[RFC PATCH] net: fdb: generic set support for drivers without ndo_op

If the driver does not support the ndo_op use the generic
handler for it. This should work in the majority of cases.
Eventually the fdb_dflt_add call gets translated into a
__dev_set_rx_mode() call which should handle hardware
support for filtering via the IFF_UNICAST_FLT flag.

Namely IFF_UNICAST_FLT indicates if the hardware can do
unicast address filtering. If no support is available
the device is put into promisc mode.

It looks like we likely also need IFF_MULTICAST_FLT and
also tighten up how drivers handle multicast filters. But
thats another patch.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 net/core/rtnetlink.c |   80 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 74 insertions(+), 6 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vlad Yasevich - Feb. 27, 2013, 2:27 a.m.
On 02/26/2013 09:07 PM, John Fastabend wrote:
> On Tue, Feb 26, 2013 at 05:15:03PM -0500, David Miller wrote:
>> From: John Fastabend <john.r.fastabend@intel.com>
>> Date: Tue, 26 Feb 2013 13:58:39 -0800
>>
>>> [...]
>>>
>>>>>>
>>>>>>
>>>>>> [RFC PATCH] rtnetlink: Add support for adding/removing additional hw
>>>>>> addresses.
>>>>>>
>>>>>> Add an ability to add and remove HW addresses to the device
>>>>>> unicast and multicast address lists.  Right now, we only have
>>>>>> an ioctl() to manage the multicast addresses and there is no
>>>>>> way the manage the unicast list.
>>>>>>
>>>>>> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>>>>>
>>>>> This is a step in the right direction, and you're right that there is
>>>>> a difficulty in detecting whether support exists or not.
>>>>>
>>>>> I am so surprised that we've have ->set_rx_mode() support for multiple
>>>>> unicast MAC addresses in so many drivers all this time, yet no way
>>>>> outside of FDB to make use of it at all.
>>>>
>>>> And even that is not always available.  In most drivers it requires
>>>> module parameters or other explicit configuration steps.  Meanwhile
>>>> set_rx_mode() doesn't seem to depend on any of those and just does the
>>>> right thing.
>>>>
>>>> For what I was trying to do ioctl() was a really easy way out for both
>>>> kernel and user space implementation, so I gave is shot.
>>>>
>>>> -vlad
>>>>
>>>
>>> Don't we already support this with
>>
>> The whole point is that these multiple-unicast-address configuration
>> facilities are inaccessible without FDB, and there is no reason
>> whatsoever for that.
>
> Yes I see now sorry I was behind the thread.
>
> We could just use the fdb hooks and add a dflt routine to handle the
> case where the netdev doesn't provide a specific ndo_op for us. This
> boils down to calling set_rx_mode anyways through this call chain,
>
>     ndo_fdb_add
> 	dev_uc_add_excl
> 		__dev_set_rx_mode
> 			ndo_set_rx_mode
>
> As long as folks don't think this is too much of an abuse of the fdb
> hooks. Here's an example I only compile tested this so take it as
> an example only. It probably needs some cleanups and the dump routine
> needs some thought not sure you want to dump all MACs always.

I thought about doing, but this becomes broken on the drivers that 
support ndo_fdb_* functions.  Those drivers mainly require additional 
configuration that is not necessary for dev_uc_add_excl() to work.

For example, ixgbe and melanox requires VF to be on, qlogic needs a 
module parameter, macvlan has to be in passthrough.  However, 
ndo_set_rx_mode() in most cases doesn't care about those settings and 
just works.

So fdb will not always work even if the driver has proper support for
IFF_UNICAST_FLT.

-vlad

>
>
> [RFC PATCH] net: fdb: generic set support for drivers without ndo_op
>
> If the driver does not support the ndo_op use the generic
> handler for it. This should work in the majority of cases.
> Eventually the fdb_dflt_add call gets translated into a
> __dev_set_rx_mode() call which should handle hardware
> support for filtering via the IFF_UNICAST_FLT flag.
>
> Namely IFF_UNICAST_FLT indicates if the hardware can do
> unicast address filtering. If no support is available
> the device is put into promisc mode.
>
> It looks like we likely also need IFF_MULTICAST_FLT and
> also tighten up how drivers handle multicast filters. But
> thats another patch.
>
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> ---
>   net/core/rtnetlink.c |   80 ++++++++++++++++++++++++++++++++++++++++++++++----
>   1 file changed, 74 insertions(+), 6 deletions(-)
>
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index d8aa20f..d1c6f71 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -2049,6 +2049,37 @@ errout:
>   	rtnl_set_sk_err(net, RTNLGRP_NEIGH, err);
>   }
>
> +/**
> + * ndo_dflt_fdb_add - default netdevice operation to add an FDB entry
> + */
> +static int ndo_dflt_fdb_add(struct ndmsg *ndm,
> +			    struct nlattr *tb[],
> +			    struct net_device *dev,
> +			    const unsigned char *addr,
> +			    u16 flags)
> +{
> +	int err = -EINVAL;
> +
> +	/* If aging addresses are supported device will need to
> +	 * implement its own handler for this.
> +	 */
> +	if (ndm->ndm_state && !(ndm->ndm_state & NUD_PERMANENT)) {
> +		pr_info("%s: FDB only supports static addresses\n", dev->name);
> +		return err;
> +	}
> +
> +	if (is_unicast_ether_addr(addr) || is_link_local_ether_addr(addr))
> +		err = dev_uc_add_excl(dev, addr);
> +	else if (is_multicast_ether_addr(addr))
> +		err = dev_mc_add_excl(dev, addr);
> +
> +	/* Only return duplicate errors if NLM_F_EXCL is set */
> +	if (err == -EEXIST && !(flags & NLM_F_EXCL))
> +		err = 0;
> +
> +	return err;
> +}
> +
>   static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
>   {
>   	struct net *net = sock_net(skb->sk);
> @@ -2101,10 +2132,14 @@ static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
>   	}
>
>   	/* Embedded bridge, macvlan, and any other device support */
> -	if ((ndm->ndm_flags & NTF_SELF) && dev->netdev_ops->ndo_fdb_add) {
> -		err = dev->netdev_ops->ndo_fdb_add(ndm, tb,
> -						   dev, addr,
> -						   nlh->nlmsg_flags);
> +	if ((ndm->ndm_flags & NTF_SELF)) {
> +		if (dev->netdev_ops->ndo_fdb_add)
> +			err = dev->netdev_ops->ndo_fdb_add(ndm, tb,
> +						   	   dev, addr,
> +						   	   nlh->nlmsg_flags);
> +		else
> +			err = ndo_dflt_fdb_add(ndm, tb, dev, addr,
> +					       nlh->nlmsg_flags);
>
>   		if (!err) {
>   			rtnl_fdb_notify(dev, addr, RTM_NEWNEIGH);
> @@ -2115,6 +2150,34 @@ out:
>   	return err;
>   }
>
> +/**
> + * ndo_dflt_fdb_del - default netdevice operation to delete an FDB entry
> + */
> +static int ndo_dflt_fdb_del(struct ndmsg *ndm,
> +			    struct nlattr *tb[],
> +			    struct net_device *dev,
> +			    const unsigned char *addr)
> +{
> +	int err = -EOPNOTSUPP;
> +
> +	/* If aging addresses are supported device will need to
> +	 * implement its own handler for this.
> +	 */
> +	if (ndm->ndm_state & NUD_PERMANENT) {
> +		pr_info("%s: FDB only supports static addresses\n", dev->name);
> +		return -EINVAL;
> +	}
> +
> +	if (is_unicast_ether_addr(addr))
> +		err = dev_uc_del(dev, addr);
> +	else if (is_multicast_ether_addr(addr))
> +		err = dev_mc_del(dev, addr);
> +	else
> +		err = -EINVAL;
> +
> +	return err;
> +}
> +
>   static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
>   {
>   	struct net *net = sock_net(skb->sk);
> @@ -2172,8 +2235,11 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
>   	}
>
>   	/* Embedded bridge, macvlan, and any other device support */
> -	if ((ndm->ndm_flags & NTF_SELF) && dev->netdev_ops->ndo_fdb_del) {
> -		err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr);
> +	if (ndm->ndm_flags & NTF_SELF) {
> +		if (dev->netdev_ops->ndo_fdb_del)
> +			err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr);
> +		else
> +			err = ndo_dflt_fdb_del(ndm, tb, dev, addr);
>
>   		if (!err) {
>   			rtnl_fdb_notify(dev, addr, RTM_DELNEIGH);
> @@ -2258,6 +2324,8 @@ static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
>
>   		if (dev->netdev_ops->ndo_fdb_dump)
>   			idx = dev->netdev_ops->ndo_fdb_dump(skb, cb, dev, idx);
> +		else
> +			ndo_dflt_fdb_dump(skb, cb, dev, idx);
>   	}
>   	rcu_read_unlock();
>
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Fastabend - Feb. 27, 2013, 5:01 a.m.
On 2/26/2013 6:27 PM, Vlad Yasevich wrote:
> On 02/26/2013 09:07 PM, John Fastabend wrote:
>> On Tue, Feb 26, 2013 at 05:15:03PM -0500, David Miller wrote:
>>> From: John Fastabend <john.r.fastabend@intel.com>
>>> Date: Tue, 26 Feb 2013 13:58:39 -0800
>>>
>>>> [...]
>>>>
>>>>>>>
>>>>>>>
>>>>>>> [RFC PATCH] rtnetlink: Add support for adding/removing additional hw
>>>>>>> addresses.
>>>>>>>
>>>>>>> Add an ability to add and remove HW addresses to the device
>>>>>>> unicast and multicast address lists.  Right now, we only have
>>>>>>> an ioctl() to manage the multicast addresses and there is no
>>>>>>> way the manage the unicast list.
>>>>>>>
>>>>>>> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>>>>>>
>>>>>> This is a step in the right direction, and you're right that there is
>>>>>> a difficulty in detecting whether support exists or not.
>>>>>>
>>>>>> I am so surprised that we've have ->set_rx_mode() support for
>>>>>> multiple
>>>>>> unicast MAC addresses in so many drivers all this time, yet no way
>>>>>> outside of FDB to make use of it at all.
>>>>>
>>>>> And even that is not always available.  In most drivers it requires
>>>>> module parameters or other explicit configuration steps.  Meanwhile
>>>>> set_rx_mode() doesn't seem to depend on any of those and just does the
>>>>> right thing.
>>>>>
>>>>> For what I was trying to do ioctl() was a really easy way out for both
>>>>> kernel and user space implementation, so I gave is shot.
>>>>>
>>>>> -vlad
>>>>>
>>>>
>>>> Don't we already support this with
>>>
>>> The whole point is that these multiple-unicast-address configuration
>>> facilities are inaccessible without FDB, and there is no reason
>>> whatsoever for that.
>>
>> Yes I see now sorry I was behind the thread.
>>
>> We could just use the fdb hooks and add a dflt routine to handle the
>> case where the netdev doesn't provide a specific ndo_op for us. This
>> boils down to calling set_rx_mode anyways through this call chain,
>>
>>     ndo_fdb_add
>>     dev_uc_add_excl
>>         __dev_set_rx_mode
>>             ndo_set_rx_mode
>>
>> As long as folks don't think this is too much of an abuse of the fdb
>> hooks. Here's an example I only compile tested this so take it as
>> an example only. It probably needs some cleanups and the dump routine
>> needs some thought not sure you want to dump all MACs always.
>
> I thought about doing, but this becomes broken on the drivers that
> support ndo_fdb_* functions.  Those drivers mainly require additional
> configuration that is not necessary for dev_uc_add_excl() to work.
>
> For example, ixgbe and melanox requires VF to be on, qlogic needs a
> module parameter, macvlan has to be in passthrough.  However,
> ndo_set_rx_mode() in most cases doesn't care about those settings and
> just works.
>

The ixgbe piece could just use the dflt routines I put below. The
SR-IOV check is only there because when I wrote that I didn't consider
adding additional addresses without VFs. This was a poor example.

The mlx4 card is checking if the device is a slave or master before
adding/removing addresses. My guess is this is the same type of check
as ixgbe and would work just fine without it.

I'm not sure macvlan supports unicast hw addresses either way but
there is no reason we couldn't. The idea was we would come back and
write it later. Like many things I haven't got to it yet.

Calling set_rx_mode() on qlcnic doesn't look to help you at all either
it appears to ignore the uc_list either way.

is that all the callers? looks like it.

> So fdb will not always work even if the driver has proper support for
> IFF_UNICAST_FLT.
>

The fdb could work if we fix the drivers it seems to me most
the cases where it wouldn't work are just lack of foresight or the code
isn't there yet. I think it might be valuable to have a single API for
both use cases.

.John

> -vlad
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jitendra Kalsaria - Feb. 27, 2013, 5:42 a.m.
On 2/26/13 9:01 PM, "John Fastabend" <john.r.fastabend@intel.com> wrote:

>On 2/26/2013 6:27 PM, Vlad Yasevich wrote:
>> On 02/26/2013 09:07 PM, John Fastabend wrote:
>>> On Tue, Feb 26, 2013 at 05:15:03PM -0500, David Miller wrote:
>>>> From: John Fastabend <john.r.fastabend@intel.com>
>>>> Date: Tue, 26 Feb 2013 13:58:39 -0800
>>>>
>>>>> [...]
>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> [RFC PATCH] rtnetlink: Add support for adding/removing additional
>>>>>>>>hw
>>>>>>>> addresses.
>>>>>>>>
>>>>>>>> Add an ability to add and remove HW addresses to the device
>>>>>>>> unicast and multicast address lists.  Right now, we only have
>>>>>>>> an ioctl() to manage the multicast addresses and there is no
>>>>>>>> way the manage the unicast list.
>>>>>>>>
>>>>>>>> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>>>>>>>
>>>>>>> This is a step in the right direction, and you're right that there
>>>>>>>is
>>>>>>> a difficulty in detecting whether support exists or not.
>>>>>>>
>>>>>>> I am so surprised that we've have ->set_rx_mode() support for
>>>>>>> multiple
>>>>>>> unicast MAC addresses in so many drivers all this time, yet no way
>>>>>>> outside of FDB to make use of it at all.
>>>>>>
>>>>>> And even that is not always available.  In most drivers it requires
>>>>>> module parameters or other explicit configuration steps.  Meanwhile
>>>>>> set_rx_mode() doesn't seem to depend on any of those and just does
>>>>>>the
>>>>>> right thing.
>>>>>>
>>>>>> For what I was trying to do ioctl() was a really easy way out for
>>>>>>both
>>>>>> kernel and user space implementation, so I gave is shot.
>>>>>>
>>>>>> -vlad
>>>>>>
>>>>>
>>>>> Don't we already support this with
>>>>
>>>> The whole point is that these multiple-unicast-address configuration
>>>> facilities are inaccessible without FDB, and there is no reason
>>>> whatsoever for that.
>>>
>>> Yes I see now sorry I was behind the thread.
>>>
>>> We could just use the fdb hooks and add a dflt routine to handle the
>>> case where the netdev doesn't provide a specific ndo_op for us. This
>>> boils down to calling set_rx_mode anyways through this call chain,
>>>
>>>     ndo_fdb_add
>>>     dev_uc_add_excl
>>>         __dev_set_rx_mode
>>>             ndo_set_rx_mode
>>>
>>> As long as folks don't think this is too much of an abuse of the fdb
>>> hooks. Here's an example I only compile tested this so take it as
>>> an example only. It probably needs some cleanups and the dump routine
>>> needs some thought not sure you want to dump all MACs always.
>>
>> I thought about doing, but this becomes broken on the drivers that
>> support ndo_fdb_* functions.  Those drivers mainly require additional
>> configuration that is not necessary for dev_uc_add_excl() to work.
>>
>> For example, ixgbe and melanox requires VF to be on, qlogic needs a
>> module parameter, macvlan has to be in passthrough.  However,
>> ndo_set_rx_mode() in most cases doesn't care about those settings and
>> just works.
>>
>
>The ixgbe piece could just use the dflt routines I put below. The
>SR-IOV check is only there because when I wrote that I didn't consider
>adding additional addresses without VFs. This was a poor example.
>
>The mlx4 card is checking if the device is a slave or master before
>adding/removing addresses. My guess is this is the same type of check
>as ixgbe and would work just fine without it.
>
>I'm not sure macvlan supports unicast hw addresses either way but
>there is no reason we couldn't. The idea was we would come back and
>write it later. Like many things I haven't got to it yet.
>
>Calling set_rx_mode() on qlcnic doesn't look to help you at all either
>it appears to ignore the uc_list either way.

Even though we use module parameter for fdb but it would be good it have
set_rx_mode().
We don't ignore uc_list but it never called set_rx_mode() when I use
dev_uc_add_excl when added support and thought that's how it might
designed.
  
-Jiten
>
>is that all the callers? looks like it.
>
>> So fdb will not always work even if the driver has proper support for
>> IFF_UNICAST_FLT.
>>
>
>The fdb could work if we fix the drivers it seems to me most
>the cases where it wouldn't work are just lack of foresight or the code
>isn't there yet. I think it might be valuable to have a single API for
>both use cases.
>
>.John
>
>> -vlad
>>
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Fastabend - Feb. 27, 2013, 7:34 a.m.
On 02/26/2013 09:42 PM, Jitendra Kalsaria wrote:
>
>
> On 2/26/13 9:01 PM, "John Fastabend" <john.r.fastabend@intel.com> wrote:
>
>> On 2/26/2013 6:27 PM, Vlad Yasevich wrote:
>>> On 02/26/2013 09:07 PM, John Fastabend wrote:
>>>> On Tue, Feb 26, 2013 at 05:15:03PM -0500, David Miller wrote:
>>>>> From: John Fastabend <john.r.fastabend@intel.com>
>>>>> Date: Tue, 26 Feb 2013 13:58:39 -0800
>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [RFC PATCH] rtnetlink: Add support for adding/removing additional
>>>>>>>>> hw
>>>>>>>>> addresses.
>>>>>>>>>
>>>>>>>>> Add an ability to add and remove HW addresses to the device
>>>>>>>>> unicast and multicast address lists.  Right now, we only have
>>>>>>>>> an ioctl() to manage the multicast addresses and there is no
>>>>>>>>> way the manage the unicast list.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>>>>>>>>
>>>>>>>> This is a step in the right direction, and you're right that there
>>>>>>>> is
>>>>>>>> a difficulty in detecting whether support exists or not.
>>>>>>>>
>>>>>>>> I am so surprised that we've have ->set_rx_mode() support for
>>>>>>>> multiple
>>>>>>>> unicast MAC addresses in so many drivers all this time, yet no way
>>>>>>>> outside of FDB to make use of it at all.
>>>>>>>
>>>>>>> And even that is not always available.  In most drivers it requires
>>>>>>> module parameters or other explicit configuration steps.  Meanwhile
>>>>>>> set_rx_mode() doesn't seem to depend on any of those and just does
>>>>>>> the
>>>>>>> right thing.
>>>>>>>
>>>>>>> For what I was trying to do ioctl() was a really easy way out for
>>>>>>> both
>>>>>>> kernel and user space implementation, so I gave is shot.
>>>>>>>
>>>>>>> -vlad
>>>>>>>
>>>>>>
>>>>>> Don't we already support this with
>>>>>
>>>>> The whole point is that these multiple-unicast-address configuration
>>>>> facilities are inaccessible without FDB, and there is no reason
>>>>> whatsoever for that.
>>>>
>>>> Yes I see now sorry I was behind the thread.
>>>>
>>>> We could just use the fdb hooks and add a dflt routine to handle the
>>>> case where the netdev doesn't provide a specific ndo_op for us. This
>>>> boils down to calling set_rx_mode anyways through this call chain,
>>>>
>>>>      ndo_fdb_add
>>>>      dev_uc_add_excl
>>>>          __dev_set_rx_mode
>>>>              ndo_set_rx_mode
>>>>
>>>> As long as folks don't think this is too much of an abuse of the fdb
>>>> hooks. Here's an example I only compile tested this so take it as
>>>> an example only. It probably needs some cleanups and the dump routine
>>>> needs some thought not sure you want to dump all MACs always.
>>>
>>> I thought about doing, but this becomes broken on the drivers that
>>> support ndo_fdb_* functions.  Those drivers mainly require additional
>>> configuration that is not necessary for dev_uc_add_excl() to work.
>>>
>>> For example, ixgbe and melanox requires VF to be on, qlogic needs a
>>> module parameter, macvlan has to be in passthrough.  However,
>>> ndo_set_rx_mode() in most cases doesn't care about those settings and
>>> just works.
>>>
>>
>> The ixgbe piece could just use the dflt routines I put below. The
>> SR-IOV check is only there because when I wrote that I didn't consider
>> adding additional addresses without VFs. This was a poor example.
>>
>> The mlx4 card is checking if the device is a slave or master before
>> adding/removing addresses. My guess is this is the same type of check
>> as ixgbe and would work just fine without it.
>>
>> I'm not sure macvlan supports unicast hw addresses either way but
>> there is no reason we couldn't. The idea was we would come back and
>> write it later. Like many things I haven't got to it yet.
>>
>> Calling set_rx_mode() on qlcnic doesn't look to help you at all either
>> it appears to ignore the uc_list either way.
>
> Even though we use module parameter for fdb but it would be good it have
> set_rx_mode().
> We don't ignore uc_list but it never called set_rx_mode() when I use
> dev_uc_add_excl when added support and thought that's how it might
> designed.
>
> -Jiten
>>

Hopefully not too off topic but you probably should use dev_uc_add_excl
in fdb_add and then fix your set_rx_mode() handler to configure the
uc_list correctly. Not sure what happens today when a unicast addr
is added to the list for example with macvlan looks like a bug to me.


.John
Vlad Yasevich - Feb. 27, 2013, 3:27 p.m.
On 02/27/2013 12:42 AM, Jitendra Kalsaria wrote:
>
>
> On 2/26/13 9:01 PM, "John Fastabend" <john.r.fastabend@intel.com> wrote:
>
>> On 2/26/2013 6:27 PM, Vlad Yasevich wrote:
>>> On 02/26/2013 09:07 PM, John Fastabend wrote:
>>>> On Tue, Feb 26, 2013 at 05:15:03PM -0500, David Miller wrote:
>>>>> From: John Fastabend <john.r.fastabend@intel.com>
>>>>> Date: Tue, 26 Feb 2013 13:58:39 -0800
>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [RFC PATCH] rtnetlink: Add support for adding/removing additional
>>>>>>>>> hw
>>>>>>>>> addresses.
>>>>>>>>>
>>>>>>>>> Add an ability to add and remove HW addresses to the device
>>>>>>>>> unicast and multicast address lists.  Right now, we only have
>>>>>>>>> an ioctl() to manage the multicast addresses and there is no
>>>>>>>>> way the manage the unicast list.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>>>>>>>>
>>>>>>>> This is a step in the right direction, and you're right that there
>>>>>>>> is
>>>>>>>> a difficulty in detecting whether support exists or not.
>>>>>>>>
>>>>>>>> I am so surprised that we've have ->set_rx_mode() support for
>>>>>>>> multiple
>>>>>>>> unicast MAC addresses in so many drivers all this time, yet no way
>>>>>>>> outside of FDB to make use of it at all.
>>>>>>>
>>>>>>> And even that is not always available.  In most drivers it requires
>>>>>>> module parameters or other explicit configuration steps.  Meanwhile
>>>>>>> set_rx_mode() doesn't seem to depend on any of those and just does
>>>>>>> the
>>>>>>> right thing.
>>>>>>>
>>>>>>> For what I was trying to do ioctl() was a really easy way out for
>>>>>>> both
>>>>>>> kernel and user space implementation, so I gave is shot.
>>>>>>>
>>>>>>> -vlad
>>>>>>>
>>>>>>
>>>>>> Don't we already support this with
>>>>>
>>>>> The whole point is that these multiple-unicast-address configuration
>>>>> facilities are inaccessible without FDB, and there is no reason
>>>>> whatsoever for that.
>>>>
>>>> Yes I see now sorry I was behind the thread.
>>>>
>>>> We could just use the fdb hooks and add a dflt routine to handle the
>>>> case where the netdev doesn't provide a specific ndo_op for us. This
>>>> boils down to calling set_rx_mode anyways through this call chain,
>>>>
>>>>      ndo_fdb_add
>>>>      dev_uc_add_excl
>>>>          __dev_set_rx_mode
>>>>              ndo_set_rx_mode
>>>>
>>>> As long as folks don't think this is too much of an abuse of the fdb
>>>> hooks. Here's an example I only compile tested this so take it as
>>>> an example only. It probably needs some cleanups and the dump routine
>>>> needs some thought not sure you want to dump all MACs always.
>>>
>>> I thought about doing, but this becomes broken on the drivers that
>>> support ndo_fdb_* functions.  Those drivers mainly require additional
>>> configuration that is not necessary for dev_uc_add_excl() to work.
>>>
>>> For example, ixgbe and melanox requires VF to be on, qlogic needs a
>>> module parameter, macvlan has to be in passthrough.  However,
>>> ndo_set_rx_mode() in most cases doesn't care about those settings and
>>> just works.
>>>
>>
>> The ixgbe piece could just use the dflt routines I put below. The
>> SR-IOV check is only there because when I wrote that I didn't consider
>> adding additional addresses without VFs. This was a poor example.
>>
>> The mlx4 card is checking if the device is a slave or master before
>> adding/removing addresses. My guess is this is the same type of check
>> as ixgbe and would work just fine without it.
>>
>> I'm not sure macvlan supports unicast hw addresses either way but
>> there is no reason we couldn't. The idea was we would come back and
>> write it later. Like many things I haven't got to it yet.
>>
>> Calling set_rx_mode() on qlcnic doesn't look to help you at all either
>> it appears to ignore the uc_list either way.
>
> Even though we use module parameter for fdb but it would be good it have
> set_rx_mode().
> We don't ignore uc_list but it never called set_rx_mode() when I use
> dev_uc_add_excl when added support and thought that's how it might
> designed.

I doesn't look like you ever call dev_uc_add_excl().  Also, since the
driver doesn't set IFF_UNICAST_FLT, anyone who does call that (macvlan
for instance) will end up turning promisc on.

-vlad


>
> -Jiten
>>
>> is that all the callers? looks like it.
>>
>>> So fdb will not always work even if the driver has proper support for
>>> IFF_UNICAST_FLT.
>>>
>>
>> The fdb could work if we fix the drivers it seems to me most
>> the cases where it wouldn't work are just lack of foresight or the code
>> isn't there yet. I think it might be valuable to have a single API for
>> both use cases.
>>
>> .John
>>
>>> -vlad
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index d8aa20f..d1c6f71 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2049,6 +2049,37 @@  errout:
 	rtnl_set_sk_err(net, RTNLGRP_NEIGH, err);
 }
 
+/**
+ * ndo_dflt_fdb_add - default netdevice operation to add an FDB entry
+ */
+static int ndo_dflt_fdb_add(struct ndmsg *ndm,
+			    struct nlattr *tb[],
+			    struct net_device *dev,
+			    const unsigned char *addr,
+			    u16 flags)
+{
+	int err = -EINVAL;
+
+	/* If aging addresses are supported device will need to
+	 * implement its own handler for this.
+	 */
+	if (ndm->ndm_state && !(ndm->ndm_state & NUD_PERMANENT)) {
+		pr_info("%s: FDB only supports static addresses\n", dev->name);
+		return err;
+	}
+
+	if (is_unicast_ether_addr(addr) || is_link_local_ether_addr(addr))
+		err = dev_uc_add_excl(dev, addr);
+	else if (is_multicast_ether_addr(addr))
+		err = dev_mc_add_excl(dev, addr);
+
+	/* Only return duplicate errors if NLM_F_EXCL is set */
+	if (err == -EEXIST && !(flags & NLM_F_EXCL))
+		err = 0;
+
+	return err;
+}
+
 static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 {
 	struct net *net = sock_net(skb->sk);
@@ -2101,10 +2132,14 @@  static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	}
 
 	/* Embedded bridge, macvlan, and any other device support */
-	if ((ndm->ndm_flags & NTF_SELF) && dev->netdev_ops->ndo_fdb_add) {
-		err = dev->netdev_ops->ndo_fdb_add(ndm, tb,
-						   dev, addr,
-						   nlh->nlmsg_flags);
+	if ((ndm->ndm_flags & NTF_SELF)) {
+		if (dev->netdev_ops->ndo_fdb_add)
+			err = dev->netdev_ops->ndo_fdb_add(ndm, tb,
+						   	   dev, addr,
+						   	   nlh->nlmsg_flags);
+		else
+			err = ndo_dflt_fdb_add(ndm, tb, dev, addr,
+					       nlh->nlmsg_flags);
 
 		if (!err) {
 			rtnl_fdb_notify(dev, addr, RTM_NEWNEIGH);
@@ -2115,6 +2150,34 @@  out:
 	return err;
 }
 
+/**
+ * ndo_dflt_fdb_del - default netdevice operation to delete an FDB entry
+ */
+static int ndo_dflt_fdb_del(struct ndmsg *ndm,
+			    struct nlattr *tb[],
+			    struct net_device *dev,
+			    const unsigned char *addr)
+{
+	int err = -EOPNOTSUPP;
+
+	/* If aging addresses are supported device will need to
+	 * implement its own handler for this.
+	 */
+	if (ndm->ndm_state & NUD_PERMANENT) {
+		pr_info("%s: FDB only supports static addresses\n", dev->name);
+		return -EINVAL;
+	}
+
+	if (is_unicast_ether_addr(addr))
+		err = dev_uc_del(dev, addr);
+	else if (is_multicast_ether_addr(addr))
+		err = dev_mc_del(dev, addr);
+	else
+		err = -EINVAL;
+
+	return err;
+}
+
 static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 {
 	struct net *net = sock_net(skb->sk);
@@ -2172,8 +2235,11 @@  static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	}
 
 	/* Embedded bridge, macvlan, and any other device support */
-	if ((ndm->ndm_flags & NTF_SELF) && dev->netdev_ops->ndo_fdb_del) {
-		err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr);
+	if (ndm->ndm_flags & NTF_SELF) {
+		if (dev->netdev_ops->ndo_fdb_del)
+			err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr);
+		else
+			err = ndo_dflt_fdb_del(ndm, tb, dev, addr);
 
 		if (!err) {
 			rtnl_fdb_notify(dev, addr, RTM_DELNEIGH);
@@ -2258,6 +2324,8 @@  static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
 
 		if (dev->netdev_ops->ndo_fdb_dump)
 			idx = dev->netdev_ops->ndo_fdb_dump(skb, cb, dev, idx);
+		else
+			ndo_dflt_fdb_dump(skb, cb, dev, idx);
 	}
 	rcu_read_unlock();