diff mbox

[2/3] bridge: offload bridge port attributes to switch asic if feature flag set

Message ID 1417746401-8140-3-git-send-email-roopa@cumulusnetworks.com
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Roopa Prabhu Dec. 5, 2014, 2:26 a.m. UTC
From: Roopa Prabhu <roopa@cumulusnetworks.com>

This allows offloading to switch asic without having the user to set
any flag. And this is done in the bridge driver to rollback kernel settings
on hw offload failure if required in the future.

With this, it also makes sure a notification goes out only after the
attributes are set both in the kernel and hw.
---
 net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

Comments

Scott Feldman Dec. 5, 2014, 6:41 a.m. UTC | #1
On Thu, Dec 4, 2014 at 6:26 PM,  <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This allows offloading to switch asic without having the user to set
> any flag. And this is done in the bridge driver to rollback kernel settings
> on hw offload failure if required in the future.
>
> With this, it also makes sure a notification goes out only after the
> attributes are set both in the kernel and hw.

I like this approach as it streamlines the steps for the user in
setting port flags.  There is one case for FLOODING where you'll have
to turn off flooding for both, and then turn on flooding in hw.  You
don't want flooding turned on on kernel and hw.

> ---
>  net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 9f5eb55..ce173f0 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
>                                 afspec, RTM_SETLINK);
>         }
>
> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
> +                       dev->netdev_ops->ndo_bridge_setlink) {
> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);

I think you want to up-level this to net/core/rtnetlink.c because
you're only enabling the feature for one instance of a driver that
implements ndo_bridge_setlink: the bridge driver.  If another driver
was MASTER and implemented ndo_bridge_setlink, you'd want same check
to push setting down to SELF port driver.

> +               if (ret && ret != -EOPNOTSUPP) {
> +                       /* XXX Fix this in the future to rollback
> +                        * kernel settings and return error
> +                        */

The future is now.  Let's fix this now for the rollback case (again up
in rtnetlink.c).  So then a general question comes to mind: for these
dual target sets, is it best to try HW first and then SW, or the other
way around?  Either way, on failure on second you need to rollback
first.  And, on failure, you need to know rollback value for first, so
you have to do a getlink on first before attempting set.

> +                       br_warn(p->br, "error offloading bridge attributes "
> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,
> +                                       p->dev->name);
> +               }
> +       }
> +
>         if (err == 0)
>                 br_ifinfo_notify(RTM_NEWLINK, p);
> -
>  out:
>         return err;
>  }
> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh)
>         err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
>                         afspec, RTM_DELLINK);
>
> +       if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
> +                       && dev->netdev_ops->ndo_bridge_setlink) {
> +               int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
> +               if (ret && ret != -EOPNOTSUPP) {
> +                       /* XXX Fix this in the future to rollback
> +                        * kernel settings and return error
> +                        */
> +                       br_warn(p->br, "error offloading bridge attributes "
> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,
> +                                       p->dev->name);
> +               }
> +       }
> +

Same comments as setlink above.

>         return err;
>  }
>  static int br_validate(struct nlattr *tb[], struct nlattr *data[])
> --
> 1.7.10.4
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Fastabend Dec. 5, 2014, 6:55 a.m. UTC | #2
On 12/04/2014 10:41 PM, Scott Feldman wrote:
> On Thu, Dec 4, 2014 at 6:26 PM,  <roopa@cumulusnetworks.com> wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This allows offloading to switch asic without having the user to set
>> any flag. And this is done in the bridge driver to rollback kernel settings
>> on hw offload failure if required in the future.
>>
>> With this, it also makes sure a notification goes out only after the
>> attributes are set both in the kernel and hw.
>
> I like this approach as it streamlines the steps for the user in
> setting port flags.  There is one case for FLOODING where you'll have
> to turn off flooding for both, and then turn on flooding in hw.  You
> don't want flooding turned on on kernel and hw.
>
>> ---
>>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>   1 file changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 9f5eb55..ce173f0 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
>>                                  afspec, RTM_SETLINK);
>>          }
>>
>> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>> +                       dev->netdev_ops->ndo_bridge_setlink) {
>> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);
>
> I think you want to up-level this to net/core/rtnetlink.c because
> you're only enabling the feature for one instance of a driver that
> implements ndo_bridge_setlink: the bridge driver.  If another driver
> was MASTER and implemented ndo_bridge_setlink, you'd want same check
> to push setting down to SELF port driver.

Also if the user set SELF && MASTER flags && had HW_SWITCH_OFFLOAD bit
set we would call ndo_bridge_setlink twice on the dev. Maybe you can
catch this case in rtnetlink.c and only call it once.

>
>> +               if (ret && ret != -EOPNOTSUPP) {
>> +                       /* XXX Fix this in the future to rollback
>> +                        * kernel settings and return error
>> +                        */
>
> The future is now.  Let's fix this now for the rollback case (again up
> in rtnetlink.c).  So then a general question comes to mind: for these
> dual target sets, is it best to try HW first and then SW, or the other
> way around?  Either way, on failure on second you need to rollback
> first.  And, on failure, you need to know rollback value for first, so
> you have to do a getlink on first before attempting set.

It might be helpful to return some indication of what object failed as
well.

>
>> +                       br_warn(p->br, "error offloading bridge attributes "
>> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,
>> +                                       p->dev->name);
>> +               }
>> +       }
>> +
>>          if (err == 0)
>>                  br_ifinfo_notify(RTM_NEWLINK, p);
>> -
>>   out:
>>          return err;
>>   }
>> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh)
>>          err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
>>                          afspec, RTM_DELLINK);
>>
>> +       if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
>> +                       && dev->netdev_ops->ndo_bridge_setlink) {
>> +               int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
>> +               if (ret && ret != -EOPNOTSUPP) {
>> +                       /* XXX Fix this in the future to rollback
>> +                        * kernel settings and return error
>> +                        */
>> +                       br_warn(p->br, "error offloading bridge attributes "
>> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,
>> +                                       p->dev->name);
>> +               }
>> +       }
>> +
>
> Same comments as setlink above.
>
>>          return err;
>>   }
>>   static int br_validate(struct nlattr *tb[], struct nlattr *data[])
>> --
>> 1.7.10.4
>>
Roopa Prabhu Dec. 5, 2014, 7:02 a.m. UTC | #3
On 12/4/14, 10:41 PM, Scott Feldman wrote:
> On Thu, Dec 4, 2014 at 6:26 PM,  <roopa@cumulusnetworks.com> wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This allows offloading to switch asic without having the user to set
>> any flag. And this is done in the bridge driver to rollback kernel settings
>> on hw offload failure if required in the future.
>>
>> With this, it also makes sure a notification goes out only after the
>> attributes are set both in the kernel and hw.
> I like this approach as it streamlines the steps for the user in
> setting port flags.  There is one case for FLOODING where you'll have
> to turn off flooding for both, and then turn on flooding in hw.  You
> don't want flooding turned on on kernel and hw.
ok, maybe using the higher bits as in 
https://patchwork.ozlabs.org/patch/413211/

might help with that. Let me think some more.
>
>> ---
>>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>   1 file changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 9f5eb55..ce173f0 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
>>                                  afspec, RTM_SETLINK);
>>          }
>>
>> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>> +                       dev->netdev_ops->ndo_bridge_setlink) {
>> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);
> I think you want to up-level this to net/core/rtnetlink.c because
> you're only enabling the feature for one instance of a driver that
> implements ndo_bridge_setlink: the bridge driver.  If another driver
> was MASTER and implemented ndo_bridge_setlink, you'd want same check
> to push setting down to SELF port driver.

yeah, i thought about that. But i moved it here so that rollback would 
be easier.
>
>> +               if (ret && ret != -EOPNOTSUPP) {
>> +                       /* XXX Fix this in the future to rollback
>> +                        * kernel settings and return error
>> +                        */
> The future is now.  Let's fix this now for the rollback case (again up
> in rtnetlink.c).  So then a general question comes to mind: for these
> dual target sets, is it best to try HW first and then SW, or the other
> way around?  Either way, on failure on second you need to rollback
> first.  And, on failure, you need to know rollback value for first, so
> you have to do a getlink on first before attempting set.
yep, exactly, I went through the same thought process yesterday when i 
was trying to implement rollback.
>
>> +                       br_warn(p->br, "error offloading bridge attributes "
>> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,
>> +                                       p->dev->name);
>> +               }
>> +       }
>> +
>>          if (err == 0)
>>                  br_ifinfo_notify(RTM_NEWLINK, p);
>> -
>>   out:
>>          return err;
>>   }
>> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh)
>>          err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
>>                          afspec, RTM_DELLINK);
>>
>> +       if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
>> +                       && dev->netdev_ops->ndo_bridge_setlink) {
>> +               int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
>> +               if (ret && ret != -EOPNOTSUPP) {
>> +                       /* XXX Fix this in the future to rollback
>> +                        * kernel settings and return error
>> +                        */
>> +                       br_warn(p->br, "error offloading bridge attributes "
>> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,
>> +                                       p->dev->name);
>> +               }
>> +       }
>> +
> Same comments as setlink above.
>
>>          return err;
>>   }
>>   static int br_validate(struct nlattr *tb[], struct nlattr *data[])
>> --
>> 1.7.10.4
>>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roopa Prabhu Dec. 5, 2014, 7:10 a.m. UTC | #4
On 12/4/14, 10:55 PM, John Fastabend wrote:
> On 12/04/2014 10:41 PM, Scott Feldman wrote:
>> On Thu, Dec 4, 2014 at 6:26 PM, <roopa@cumulusnetworks.com> wrote:
>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>
>>> This allows offloading to switch asic without having the user to set
>>> any flag. And this is done in the bridge driver to rollback kernel 
>>> settings
>>> on hw offload failure if required in the future.
>>>
>>> With this, it also makes sure a notification goes out only after the
>>> attributes are set both in the kernel and hw.
>>
>> I like this approach as it streamlines the steps for the user in
>> setting port flags.  There is one case for FLOODING where you'll have
>> to turn off flooding for both, and then turn on flooding in hw. You
>> don't want flooding turned on on kernel and hw.
>>
>>> ---
>>>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>>   1 file changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>> index 9f5eb55..ce173f0 100644
>>> --- a/net/bridge/br_netlink.c
>>> +++ b/net/bridge/br_netlink.c
>>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct 
>>> nlmsghdr *nlh)
>>>                                  afspec, RTM_SETLINK);
>>>          }
>>>
>>> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>>> + dev->netdev_ops->ndo_bridge_setlink) {
>>> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev, 
>>> nlh);
>>
>> I think you want to up-level this to net/core/rtnetlink.c because
>> you're only enabling the feature for one instance of a driver that
>> implements ndo_bridge_setlink: the bridge driver.  If another driver
>> was MASTER and implemented ndo_bridge_setlink, you'd want same check
>> to push setting down to SELF port driver.
>
> Also if the user set SELF && MASTER flags && had HW_SWITCH_OFFLOAD bit
> set we would call ndo_bridge_setlink twice on the dev. Maybe you can
> catch this case in rtnetlink.c and only call it once.

yes, thought about this and when i looked at iproute2 code, it is either 
master
or self today and i don't think anybody else uses both flags with the 
current
kernel implementation. But yes, that does not stop anybody from setting 
both flags.
I will handle it better in v2.
>
>>
>>> +               if (ret && ret != -EOPNOTSUPP) {
>>> +                       /* XXX Fix this in the future to rollback
>>> +                        * kernel settings and return error
>>> +                        */
>>
>> The future is now.  Let's fix this now for the rollback case (again up
>> in rtnetlink.c).  So then a general question comes to mind: for these
>> dual target sets, is it best to try HW first and then SW, or the other
>> way around?  Either way, on failure on second you need to rollback
>> first.  And, on failure, you need to know rollback value for first, so
>> you have to do a getlink on first before attempting set.
>
> It might be helpful to return some indication of what object failed as
> well.

ok...

thanks,
Roopa

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Pirko Dec. 5, 2014, 7:38 a.m. UTC | #5
Fri, Dec 05, 2014 at 03:26:40AM CET, roopa@cumulusnetworks.com wrote:
>From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
>This allows offloading to switch asic without having the user to set
>any flag. And this is done in the bridge driver to rollback kernel settings
>on hw offload failure if required in the future.
>
>With this, it also makes sure a notification goes out only after the
>attributes are set both in the kernel and hw.
>---
> net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
> 1 file changed, 26 insertions(+), 1 deletion(-)
>
>diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>index 9f5eb55..ce173f0 100644
>--- a/net/bridge/br_netlink.c
>+++ b/net/bridge/br_netlink.c
>@@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
> 				afspec, RTM_SETLINK);
> 	}
> 
>+	if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>+			dev->netdev_ops->ndo_bridge_setlink) {
>+		int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);

	This (and I suspect other patches as well) has many issues which
	are pointed out by scripts/checkpatch.pl sctipts. For example
	here, there should be an empty line. Please run your patches by
	this script before you send them.

>+		if (ret && ret != -EOPNOTSUPP) {
>+			/* XXX Fix this in the future to rollback
>+			 * kernel settings and return error
>+			 */
>+			br_warn(p->br, "error offloading bridge attributes " 
>+					"on port %u(%s)\n", (unsigned int) p->port_no,
>+					p->dev->name);
>+		}
>+	}
>+
> 	if (err == 0)
> 		br_ifinfo_notify(RTM_NEWLINK, p);
>-
> out:
> 	return err;
> }
>@@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh)
> 	err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
> 			afspec, RTM_DELLINK);
> 
>+	if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
>+			&& dev->netdev_ops->ndo_bridge_setlink) {
>+		int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);

	c&p issue, there should be check for dellink here.

>+		if (ret && ret != -EOPNOTSUPP) {
>+			/* XXX Fix this in the future to rollback
>+			 * kernel settings and return error
>+			 */
>+			br_warn(p->br, "error offloading bridge attributes " 
>+					"on port %u(%s)\n", (unsigned int) p->port_no,
>+					p->dev->name);
>+		}
>+	}
>+


	I agree with Scott that this code should be moved to rtnetlink.c

> 	return err;
> }
> static int br_validate(struct nlattr *tb[], struct nlattr *data[])
>-- 
>1.7.10.4
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jamal Hadi Salim Dec. 5, 2014, 12:07 p.m. UTC | #6
On 12/04/14 21:26, roopa@cumulusnetworks.com wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>

Ok, so this one maintains status quo. Looks reasonable
to me.

cheers,
jamal

> This allows offloading to switch asic without having the user to set
> any flag. And this is done in the bridge driver to rollback kernel settings
> on hw offload failure if required in the future.
>
> With this, it also makes sure a notification goes out only after the
> attributes are set both in the kernel and hw.
> ---
>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>   1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 9f5eb55..ce173f0 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
>   				afspec, RTM_SETLINK);
>   	}
>
> +	if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
> +			dev->netdev_ops->ndo_bridge_setlink) {
> +		int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);
> +		if (ret && ret != -EOPNOTSUPP) {
> +			/* XXX Fix this in the future to rollback
> +			 * kernel settings and return error
> +			 */
> +			br_warn(p->br, "error offloading bridge attributes "
> +					"on port %u(%s)\n", (unsigned int) p->port_no,
> +					p->dev->name);
> +		}
> +	}
> +
>   	if (err == 0)
>   		br_ifinfo_notify(RTM_NEWLINK, p);
> -
>   out:
>   	return err;
>   }
> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh)
>   	err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
>   			afspec, RTM_DELLINK);
>
> +	if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
> +			&& dev->netdev_ops->ndo_bridge_setlink) {
> +		int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
> +		if (ret && ret != -EOPNOTSUPP) {
> +			/* XXX Fix this in the future to rollback
> +			 * kernel settings and return error
> +			 */
> +			br_warn(p->br, "error offloading bridge attributes "
> +					"on port %u(%s)\n", (unsigned int) p->port_no,
> +					p->dev->name);
> +		}
> +	}
> +
>   	return err;
>   }
>   static int br_validate(struct nlattr *tb[], struct nlattr *data[])
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jamal Hadi Salim Dec. 5, 2014, 12:41 p.m. UTC | #7
On 12/05/14 02:10, Roopa Prabhu wrote:
> On 12/4/14, 10:55 PM, John Fastabend wrote:
>> On 12/04/2014 10:41 PM, Scott Feldman wrote:
>>> On Thu, Dec 4, 2014 at 6:26 PM, <roopa@cumulusnetworks.com> wrote:
>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>
>>>> This allows offloading to switch asic without having the user to set
>>>> any flag. And this is done in the bridge driver to rollback kernel
>>>> settings
>>>> on hw offload failure if required in the future.
>>>>
>>>> With this, it also makes sure a notification goes out only after the
>>>> attributes are set both in the kernel and hw.
>>>
>>> I like this approach as it streamlines the steps for the user in
>>> setting port flags.  There is one case for FLOODING where you'll have
>>> to turn off flooding for both, and then turn on flooding in hw. You
>>> don't want flooding turned on on kernel and hw.
>>>
>>>> ---
>>>>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>>>   1 file changed, 26 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>>> index 9f5eb55..ce173f0 100644
>>>> --- a/net/bridge/br_netlink.c
>>>> +++ b/net/bridge/br_netlink.c
>>>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct
>>>> nlmsghdr *nlh)
>>>>                                  afspec, RTM_SETLINK);
>>>>          }
>>>>
>>>> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>>>> + dev->netdev_ops->ndo_bridge_setlink) {
>>>> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev,
>>>> nlh);
>>>
>>> I think you want to up-level this to net/core/rtnetlink.c because
>>> you're only enabling the feature for one instance of a driver that
>>> implements ndo_bridge_setlink: the bridge driver.  If another driver
>>> was MASTER and implemented ndo_bridge_setlink, you'd want same check
>>> to push setting down to SELF port driver.
>>
>> Also if the user set SELF && MASTER flags && had HW_SWITCH_OFFLOAD bit
>> set we would call ndo_bridge_setlink twice on the dev. Maybe you can
>> catch this case in rtnetlink.c and only call it once.
>
> yes, thought about this and when i looked at iproute2 code, it is either
> master
> or self today and i don't think anybody else uses both flags with the
> current
> kernel implementation. But yes, that does not stop anybody from setting
> both flags.
> I will handle it better in v2.

folks, can we have probably 2-3 sets of patches?
#1 introduces the flags and doesnt change anything.
Others to introduce other features.

cheers,
jamal


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sergei Shtylyov Dec. 5, 2014, 1:21 p.m. UTC | #8
Hello.

On 12/5/2014 5:26 AM, roopa@cumulusnetworks.com wrote:

> From: Roopa Prabhu <roopa@cumulusnetworks.com>

> This allows offloading to switch asic without having the user to set
> any flag. And this is done in the bridge driver to rollback kernel settings
> on hw offload failure if required in the future.

> With this, it also makes sure a notification goes out only after the
> attributes are set both in the kernel and hw.
> ---
>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>   1 file changed, 26 insertions(+), 1 deletion(-)

> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 9f5eb55..ce173f0 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
>   				afspec, RTM_SETLINK);
>   	}
>
> +	if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
> +			dev->netdev_ops->ndo_bridge_setlink) {
> +		int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);

    Need empty line after declaration.

> +		if (ret && ret != -EOPNOTSUPP) {
> +			/* XXX Fix this in the future to rollback
> +			 * kernel settings and return error
> +			 */
> +			br_warn(p->br, "error offloading bridge attributes "
> +					"on port %u(%s)\n", (unsigned int) p->port_no,

    Don't break up the message strings. Also, your continuation lines should 
start under the next character after ( on the first line.

> +					p->dev->name);
> +		}
> +	}
> +
>   	if (err == 0)
>   		br_ifinfo_notify(RTM_NEWLINK, p);
> -

    Why?

>   out:
>   	return err;
>   }
> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh)
>   	err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
>   			afspec, RTM_DELLINK);
>
> +	if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
> +			&& dev->netdev_ops->ndo_bridge_setlink) {
> +		int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
> +		if (ret && ret != -EOPNOTSUPP) {
> +			/* XXX Fix this in the future to rollback
> +			 * kernel settings and return error
> +			 */
> +			br_warn(p->br, "error offloading bridge attributes "
> +					"on port %u(%s)\n", (unsigned int) p->port_no,
> +					p->dev->name);

    Same comments here.

> +		}
> +	}
> +
>   	return err;
>   }
>   static int br_validate(struct nlattr *tb[], struct nlattr *data[])

WBR, Sergei

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roopa Prabhu Dec. 5, 2014, 1:44 p.m. UTC | #9
On 12/4/14, 11:38 PM, Jiri Pirko wrote:
> Fri, Dec 05, 2014 at 03:26:40AM CET, roopa@cumulusnetworks.com wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This allows offloading to switch asic without having the user to set
>> any flag. And this is done in the bridge driver to rollback kernel settings
>> on hw offload failure if required in the future.
>>
>> With this, it also makes sure a notification goes out only after the
>> attributes are set both in the kernel and hw.
>> ---
>> net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>> 1 file changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 9f5eb55..ce173f0 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
>> 				afspec, RTM_SETLINK);
>> 	}
>>
>> +	if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>> +			dev->netdev_ops->ndo_bridge_setlink) {
>> +		int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);
> 	This (and I suspect other patches as well) has many issues which
> 	are pointed out by scripts/checkpatch.pl sctipts. For example
> 	here, there should be an empty line. Please run your patches by
> 	this script before you send them.

ack, sure thing (I usually do but forgot yesterday).

>
>> +		if (ret && ret != -EOPNOTSUPP) {
>> +			/* XXX Fix this in the future to rollback
>> +			 * kernel settings and return error
>> +			 */
>> +			br_warn(p->br, "error offloading bridge attributes "
>> +					"on port %u(%s)\n", (unsigned int) p->port_no,
>> +					p->dev->name);
>> +		}
>> +	}
>> +
>> 	if (err == 0)
>> 		br_ifinfo_notify(RTM_NEWLINK, p);
>> -
>> out:
>> 	return err;
>> }
>> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh)
>> 	err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
>> 			afspec, RTM_DELLINK);
>>
>> +	if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
>> +			&& dev->netdev_ops->ndo_bridge_setlink) {
>> +		int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
> 	c&p issue, there should be check for dellink here.
>
>> +		if (ret && ret != -EOPNOTSUPP) {
>> +			/* XXX Fix this in the future to rollback
>> +			 * kernel settings and return error
>> +			 */
>> +			br_warn(p->br, "error offloading bridge attributes "
>> +					"on port %u(%s)\n", (unsigned int) p->port_no,
>> +					p->dev->name);
>> +		}
>> +	}
>> +
>
> 	I agree with Scott that this code should be moved to rtnetlink.c
>
>> 	return err;
>> }
>> static int br_validate(struct nlattr *tb[], struct nlattr *data[])
>> -- 
>> 1.7.10.4
>>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roopa Prabhu Dec. 5, 2014, 2:03 p.m. UTC | #10
On 12/5/14, 4:41 AM, Jamal Hadi Salim wrote:
> On 12/05/14 02:10, Roopa Prabhu wrote:
>> On 12/4/14, 10:55 PM, John Fastabend wrote:
>>> On 12/04/2014 10:41 PM, Scott Feldman wrote:
>>>> On Thu, Dec 4, 2014 at 6:26 PM, <roopa@cumulusnetworks.com> wrote:
>>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>
>>>>> This allows offloading to switch asic without having the user to set
>>>>> any flag. And this is done in the bridge driver to rollback kernel
>>>>> settings
>>>>> on hw offload failure if required in the future.
>>>>>
>>>>> With this, it also makes sure a notification goes out only after the
>>>>> attributes are set both in the kernel and hw.
>>>>
>>>> I like this approach as it streamlines the steps for the user in
>>>> setting port flags.  There is one case for FLOODING where you'll have
>>>> to turn off flooding for both, and then turn on flooding in hw. You
>>>> don't want flooding turned on on kernel and hw.
>>>>
>>>>> ---
>>>>>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>>>>   1 file changed, 26 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>>>> index 9f5eb55..ce173f0 100644
>>>>> --- a/net/bridge/br_netlink.c
>>>>> +++ b/net/bridge/br_netlink.c
>>>>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct
>>>>> nlmsghdr *nlh)
>>>>>                                  afspec, RTM_SETLINK);
>>>>>          }
>>>>>
>>>>> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>>>>> + dev->netdev_ops->ndo_bridge_setlink) {
>>>>> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev,
>>>>> nlh);
>>>>
>>>> I think you want to up-level this to net/core/rtnetlink.c because
>>>> you're only enabling the feature for one instance of a driver that
>>>> implements ndo_bridge_setlink: the bridge driver.  If another driver
>>>> was MASTER and implemented ndo_bridge_setlink, you'd want same check
>>>> to push setting down to SELF port driver.
>>>
>>> Also if the user set SELF && MASTER flags && had HW_SWITCH_OFFLOAD bit
>>> set we would call ndo_bridge_setlink twice on the dev. Maybe you can
>>> catch this case in rtnetlink.c and only call it once.
>>
>> yes, thought about this and when i looked at iproute2 code, it is either
>> master
>> or self today and i don't think anybody else uses both flags with the
>> current
>> kernel implementation. But yes, that does not stop anybody from setting
>> both flags.
>> I will handle it better in v2.
>
> folks, can we have probably 2-3 sets of patches?
> #1 introduces the flags and doesnt change anything.
> Others to introduce other features.
sure, I was thinking about doing that. I plan to send the "swdev" mode 
related patches first.
And the others as separate sets.




> -- 
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roopa Prabhu Dec. 5, 2014, 2:04 p.m. UTC | #11
ack again, will fix the formatting issues. thanks

On 12/5/14, 5:21 AM, Sergei Shtylyov wrote:
> Hello.
>
> On 12/5/2014 5:26 AM, roopa@cumulusnetworks.com wrote:
>
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
>> This allows offloading to switch asic without having the user to set
>> any flag. And this is done in the bridge driver to rollback kernel 
>> settings
>> on hw offload failure if required in the future.
>
>> With this, it also makes sure a notification goes out only after the
>> attributes are set both in the kernel and hw.
>> ---
>>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>   1 file changed, 26 insertions(+), 1 deletion(-)
>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 9f5eb55..ce173f0 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct 
>> nlmsghdr *nlh)
>>                   afspec, RTM_SETLINK);
>>       }
>>
>> +    if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>> +            dev->netdev_ops->ndo_bridge_setlink) {
>> +        int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);
>
>    Need empty line after declaration.
>
>> +        if (ret && ret != -EOPNOTSUPP) {
>> +            /* XXX Fix this in the future to rollback
>> +             * kernel settings and return error
>> +             */
>> +            br_warn(p->br, "error offloading bridge attributes "
>> +                    "on port %u(%s)\n", (unsigned int) p->port_no,
>
>    Don't break up the message strings. Also, your continuation lines 
> should start under the next character after ( on the first line.
>
>> +                    p->dev->name);
>> +        }
>> +    }
>> +
>>       if (err == 0)
>>           br_ifinfo_notify(RTM_NEWLINK, p);
>> -
>
>    Why?
>
>>   out:
>>       return err;
>>   }
>> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct 
>> nlmsghdr *nlh)
>>       err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
>>               afspec, RTM_DELLINK);
>>
>> +    if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
>> +            && dev->netdev_ops->ndo_bridge_setlink) {
>> +        int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
>> +        if (ret && ret != -EOPNOTSUPP) {
>> +            /* XXX Fix this in the future to rollback
>> +             * kernel settings and return error
>> +             */
>> +            br_warn(p->br, "error offloading bridge attributes "
>> +                    "on port %u(%s)\n", (unsigned int) p->port_no,
>> +                    p->dev->name);
>
>    Same comments here.
>
>> +        }
>> +    }
>> +
>>       return err;
>>   }
>>   static int br_validate(struct nlattr *tb[], struct nlattr *data[])
>
> WBR, Sergei
>
> -- 
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roopa Prabhu Dec. 5, 2014, 2:37 p.m. UTC | #12
On 12/4/14, 11:38 PM, Jiri Pirko wrote:
> Fri, Dec 05, 2014 at 03:26:40AM CET, roopa@cumulusnetworks.com wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This allows offloading to switch asic without having the user to set
>> any flag. And this is done in the bridge driver to rollback kernel settings
>> on hw offload failure if required in the future.
>>
>> With this, it also makes sure a notification goes out only after the
>> attributes are set both in the kernel and hw.
>> ---
>> net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>> 1 file changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 9f5eb55..ce173f0 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
>> 				afspec, RTM_SETLINK);
>> 	}
>>
>> +	if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>> +			dev->netdev_ops->ndo_bridge_setlink) {
>> +		int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);
> 	This (and I suspect other patches as well) has many issues which
> 	are pointed out by scripts/checkpatch.pl sctipts. For example
> 	here, there should be an empty line. Please run your patches by
> 	this script before you send them.
>
>> +		if (ret && ret != -EOPNOTSUPP) {
>> +			/* XXX Fix this in the future to rollback
>> +			 * kernel settings and return error
>> +			 */
>> +			br_warn(p->br, "error offloading bridge attributes "
>> +					"on port %u(%s)\n", (unsigned int) p->port_no,
>> +					p->dev->name);
>> +		}
>> +	}
>> +
>> 	if (err == 0)
>> 		br_ifinfo_notify(RTM_NEWLINK, p);
>> -
>> out:
>> 	return err;
>> }
>> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh)
>> 	err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
>> 			afspec, RTM_DELLINK);
>>
>> +	if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
>> +			&& dev->netdev_ops->ndo_bridge_setlink) {
>> +		int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
> 	c&p issue, there should be check for dellink here.
>
>> +		if (ret && ret != -EOPNOTSUPP) {
>> +			/* XXX Fix this in the future to rollback
>> +			 * kernel settings and return error
>> +			 */
>> +			br_warn(p->br, "error offloading bridge attributes "
>> +					"on port %u(%s)\n", (unsigned int) p->port_no,
>> +					p->dev->name);
>> +		}
>> +	}
>> +
>
> 	I agree with Scott that this code should be moved to rtnetlink.c

I moved it here to make rollback easier. Plus i don't want software 
notification to go out before hardware is programmed.
And the software notification goes out from here.

I don't intend to implement rollback in v2 (that will be a separate series).
  But, i do want to take care of the notification problem.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arad, Ronen Dec. 5, 2014, 11:21 p.m. UTC | #13
> -----Original Message-----

> From: netdev-owner@vger.kernel.org [mailto:netdev-

> owner@vger.kernel.org] On Behalf Of Roopa Prabhu

> Sent: Thursday, December 04, 2014 11:02 PM

> To: Scott Feldman

> Cc: Jiří Pírko; Jamal Hadi Salim; Benjamin LaHaise; Thomas Graf; john

> fastabend; stephen@networkplumber.org; John Linville;

> nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian

> Fainelli; buytenh@wantstofly.org; Aviad Raveh; Netdev; David S. Miller;

> shm@cumulusnetworks.com; Andy Gospodarek

> Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic

> if feature flag set

> 

> On 12/4/14, 10:41 PM, Scott Feldman wrote:

> > On Thu, Dec 4, 2014 at 6:26 PM,  <roopa@cumulusnetworks.com> wrote:

> >> From: Roopa Prabhu <roopa@cumulusnetworks.com>

> >>

> >> This allows offloading to switch asic without having the user to set

> >> any flag. And this is done in the bridge driver to rollback kernel

> >> settings on hw offload failure if required in the future.

> >>

> >> With this, it also makes sure a notification goes out only after the

> >> attributes are set both in the kernel and hw.

> > I like this approach as it streamlines the steps for the user in

> > setting port flags.  There is one case for FLOODING where you'll have

> > to turn off flooding for both, and then turn on flooding in hw.  You

> > don't want flooding turned on on kernel and hw.

> ok, maybe using the higher bits as in

> https://patchwork.ozlabs.org/patch/413211/

> 

> might help with that. Let me think some more.

> >

> >> ---

> >>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-

> >>   1 file changed, 26 insertions(+), 1 deletion(-)

> >>

> >> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index

> >> 9f5eb55..ce173f0 100644

> >> --- a/net/bridge/br_netlink.c

> >> +++ b/net/bridge/br_netlink.c

> >> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct

> nlmsghdr *nlh)

> >>                                  afspec, RTM_SETLINK);

> >>          }

> >>

> >> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&

> >> +                       dev->netdev_ops->ndo_bridge_setlink) {

> >> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev,

> >> + nlh);

> > I think you want to up-level this to net/core/rtnetlink.c because

> > you're only enabling the feature for one instance of a driver that

> > implements ndo_bridge_setlink: the bridge driver.  If another driver

> > was MASTER and implemented ndo_bridge_setlink, you'd want same check

> > to push setting down to SELF port driver.

> 

> yeah, i thought about that. But i moved it here so that rollback would be

> easier.


There is a need for propagating setlink/dellink requests down multiple levels.
The use-case I have in mind is a bridge at the top, team/bond in the middle, and port devices at the bottom.
A setlink for VLAN filtering attributes would come with MASTER flag set, and either port or bond/team netdev.
How would this be handled?

The propagation rules between bridge and enslaved port device could be different from those between bond/team and enslaved devices.
The current bridge driver does not propagate VLAN filtering from bridge to its ports as each port could have different configuration. In a case of a bond/team all members need to have the same configuration such that the a bond/team would be indistinguishable from a simple port.

Therefore rtnetlink.c might not have the knowledge for propagation across multiple levels.
It seems that each device which implements ndo_bridge_setlink/ndo_bridge_dellink  and could have master role, need to take care of propagation to its slaves.
 
> >

> >> +               if (ret && ret != -EOPNOTSUPP) {

> >> +                       /* XXX Fix this in the future to rollback

> >> +                        * kernel settings and return error

> >> +                        */

> > The future is now.  Let's fix this now for the rollback case (again up

> > in rtnetlink.c).  So then a general question comes to mind: for these

> > dual target sets, is it best to try HW first and then SW, or the other

> > way around?  Either way, on failure on second you need to rollback

> > first.  And, on failure, you need to know rollback value for first, so

> > you have to do a getlink on first before attempting set.

> yep, exactly, I went through the same thought process yesterday when i was

> trying to implement rollback.

> >

> >> +                       br_warn(p->br, "error offloading bridge attributes "

> >> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,

> >> +                                       p->dev->name);

> >> +               }

> >> +       }

> >> +

> >>          if (err == 0)

> >>                  br_ifinfo_notify(RTM_NEWLINK, p);

> >> -

> >>   out:

> >>          return err;

> >>   }

> >> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct

> nlmsghdr *nlh)

> >>          err = br_afspec((struct net_bridge *)netdev_priv(dev), p,

> >>                          afspec, RTM_DELLINK);

> >>

> >> +       if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD

> >> +                       && dev->netdev_ops->ndo_bridge_setlink) {

> >> +               int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);

> >> +               if (ret && ret != -EOPNOTSUPP) {

> >> +                       /* XXX Fix this in the future to rollback

> >> +                        * kernel settings and return error

> >> +                        */

> >> +                       br_warn(p->br, "error offloading bridge attributes "

> >> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,

> >> +                                       p->dev->name);

> >> +               }

> >> +       }

> >> +

> > Same comments as setlink above.

> >

> >>          return err;

> >>   }

> >>   static int br_validate(struct nlattr *tb[], struct nlattr *data[])

> >> --

> >> 1.7.10.4

> >>

> 

> --

> To unsubscribe from this list: send the line "unsubscribe netdev" in the body

> of a message to majordomo@vger.kernel.org More majordomo info at

> http://vger.kernel.org/majordomo-info.html
John Fastabend Dec. 6, 2014, 2:46 a.m. UTC | #14
On 12/05/2014 05:04 PM, Arad, Ronen wrote:
> I have another case of propagation which is not covered by the proposed patch.
> A recent patch introduced default_pvid attribute for a bridge (so far supported only via sysfs and not via netlink).
> When a port joins a bridge, it inherits a PVID from the default_pvid of the bridge.
> The bridge driver propagates that to the newly created net_bridge_port. This is done in br_vlan.c:
>
> int nbp_vlan_init(struct net_bridge_port *p)
> {
> 	int rc = 0;
>
> 	if (p->br->default_pvid) {
> 		rc = nbp_vlan_add(p, p->br->default_pvid,
> 				  BRIDGE_VLAN_INFO_PVID |
> 				  BRIDGE_VLAN_INFO_UNTAGGED);
> 	}
>
> 	return rc;
> }
>
> When L2 switching is offloaded to the HW, this PVID setting need to be propagated. However, it does not come via ndo_bridge_setlink. The proposed propagation at br_setlink or an up level one at rtnetlink are not capable of handling this case.
> One possible way for handling that is to replace the call to nbp_vlan_add with a call to a new function let's say
> int br_propagate_vlan_add(struct net_bridge_port *port, u16 vid, u16 flags)
> This function will compose a netlink message with VLAN filtering information (i.e. AF_SPEC with VLAN_INFO) and call br_setlink - leveraging the offload support proposed by Roopa.
>

No, we shouldn't be crafting netlink messages in the kernel just
re-inject them into an interface. Really the setlink/dellink interface
should be cleaned up so that it no longer consumes raw netlink messages.

Then either (a) add another parameter to the setlink ops or (b) create
a new op for it.

I think cleaning up the setlink/dellink hooks is on the TBD list
already.


> If this is an acceptable course of action, I could work on such patch.
>
>

[...]

Thanks,
John
Arad, Ronen Dec. 6, 2014, 3:06 a.m. UTC | #15
> -----Original Message-----

> From: John Fastabend [mailto:john.fastabend@gmail.com]

> Sent: Friday, December 05, 2014 6:46 PM

> To: Arad, Ronen

> Cc: Roopa Prabhu; Scott Feldman; Netdev; Jirí Pírko; Jamal Hadi Salim;

> Benjamin LaHaise; Thomas Graf; stephen@networkplumber.org; John

> Linville; nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com;

> Florian Fainelli; buytenh@wantstofly.org; Aviad Raveh; David S. Miller;

> shm@cumulusnetworks.com; Andy Gospodarek

> Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic

> if feature flag set

> 

> On 12/05/2014 05:04 PM, Arad, Ronen wrote:

> > I have another case of propagation which is not covered by the proposed

> patch.

> > A recent patch introduced default_pvid attribute for a bridge (so far

> supported only via sysfs and not via netlink).

> > When a port joins a bridge, it inherits a PVID from the default_pvid of the

> bridge.

> > The bridge driver propagates that to the newly created net_bridge_port.

> This is done in br_vlan.c:

> >

> > int nbp_vlan_init(struct net_bridge_port *p) {

> > 	int rc = 0;

> >

> > 	if (p->br->default_pvid) {

> > 		rc = nbp_vlan_add(p, p->br->default_pvid,

> > 				  BRIDGE_VLAN_INFO_PVID |

> > 				  BRIDGE_VLAN_INFO_UNTAGGED);

> > 	}

> >

> > 	return rc;

> > }

> >

> > When L2 switching is offloaded to the HW, this PVID setting need to be

> propagated. However, it does not come via ndo_bridge_setlink. The

> proposed propagation at br_setlink or an up level one at rtnetlink are not

> capable of handling this case.

> > One possible way for handling that is to replace the call to

> > nbp_vlan_add with a call to a new function let's say int

> > br_propagate_vlan_add(struct net_bridge_port *port, u16 vid, u16 flags)

> This function will compose a netlink message with VLAN filtering information

> (i.e. AF_SPEC with VLAN_INFO) and call br_setlink - leveraging the offload

> support proposed by Roopa.

> >

> 

> No, we shouldn't be crafting netlink messages in the kernel just re-inject

> them into an interface. Really the setlink/dellink interface should be cleaned

> up so that it no longer consumes raw netlink messages.

> 

> Then either (a) add another parameter to the setlink ops or (b) create a new

> op for it.

> 

> I think cleaning up the setlink/dellink hooks is on the TBD list already.

> 

This would be a lot cleaner even though there could be loss of flexibility. Fixed argument interface will not be extensible.
Will the non-Netlink based driver setlink/dellink hooks be TLV based or take a pointer to a single struct with some indication of what is actually populated there? 
> 

> > If this is an acceptable course of action, I could work on such patch.

> >

> >

> 

> [...]

> 

> Thanks,

> John

> 

> --

> John Fastabend         Intel Corporation
John Fastabend Dec. 6, 2014, 3:21 a.m. UTC | #16
On 12/05/2014 07:06 PM, Arad, Ronen wrote:
>
>
>> -----Original Message-----
>> From: John Fastabend [mailto:john.fastabend@gmail.com]
>> Sent: Friday, December 05, 2014 6:46 PM
>> To: Arad, Ronen
>> Cc: Roopa Prabhu; Scott Feldman; Netdev; Jirí Pírko; Jamal Hadi Salim;
>> Benjamin LaHaise; Thomas Graf; stephen@networkplumber.org; John
>> Linville; nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com;
>> Florian Fainelli; buytenh@wantstofly.org; Aviad Raveh; David S. Miller;
>> shm@cumulusnetworks.com; Andy Gospodarek
>> Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic
>> if feature flag set
>>
>> On 12/05/2014 05:04 PM, Arad, Ronen wrote:
>>> I have another case of propagation which is not covered by the proposed
>> patch.
>>> A recent patch introduced default_pvid attribute for a bridge (so far
>> supported only via sysfs and not via netlink).
>>> When a port joins a bridge, it inherits a PVID from the default_pvid of the
>> bridge.
>>> The bridge driver propagates that to the newly created net_bridge_port.
>> This is done in br_vlan.c:
>>>
>>> int nbp_vlan_init(struct net_bridge_port *p) {
>>> 	int rc = 0;
>>>
>>> 	if (p->br->default_pvid) {
>>> 		rc = nbp_vlan_add(p, p->br->default_pvid,
>>> 				  BRIDGE_VLAN_INFO_PVID |
>>> 				  BRIDGE_VLAN_INFO_UNTAGGED);
>>> 	}
>>>
>>> 	return rc;
>>> }
>>>
>>> When L2 switching is offloaded to the HW, this PVID setting need to be
>> propagated. However, it does not come via ndo_bridge_setlink. The
>> proposed propagation at br_setlink or an up level one at rtnetlink are not
>> capable of handling this case.
>>> One possible way for handling that is to replace the call to
>>> nbp_vlan_add with a call to a new function let's say int
>>> br_propagate_vlan_add(struct net_bridge_port *port, u16 vid, u16 flags)
>> This function will compose a netlink message with VLAN filtering information
>> (i.e. AF_SPEC with VLAN_INFO) and call br_setlink - leveraging the offload
>> support proposed by Roopa.
>>>
>>
>> No, we shouldn't be crafting netlink messages in the kernel just re-inject
>> them into an interface. Really the setlink/dellink interface should be cleaned
>> up so that it no longer consumes raw netlink messages.
>>
>> Then either (a) add another parameter to the setlink ops or (b) create a new
>> op for it.
>>
>> I think cleaning up the setlink/dellink hooks is on the TBD list already.
>>
> This would be a lot cleaner even though there could be loss of
> flexibility. Fixed argument interface will not be extensible.
> Will the non-Netlink based driver setlink/dellink hooks be TLV based
> or take a pointer to a single struct with some indication of what is
> actually populated there?

There shouldn't be any loss of flexibility, we can add new attributes
and new ops as we need them.

I had assumed it would be basic structures and additional ndo ops as
needed but I've not coded anything up.

>>> If this is an acceptable course of action, I could work on such patch.
>>>
>>>
>>
>> [...]
>>
>> Thanks,
>> John
>>
>> --
>> John Fastabend         Intel Corporation
Scott Feldman Dec. 6, 2014, 6:29 a.m. UTC | #17
On Fri, Dec 5, 2014 at 5:04 PM, Arad, Ronen <ronen.arad@intel.com> wrote:
> I have another case of propagation which is not covered by the proposed patch.
> A recent patch introduced default_pvid attribute for a bridge (so far supported only via sysfs and not via netlink).
> When a port joins a bridge, it inherits a PVID from the default_pvid of the bridge.
> The bridge driver propagates that to the newly created net_bridge_port. This is done in br_vlan.c:
>
> int nbp_vlan_init(struct net_bridge_port *p)
> {
>         int rc = 0;
>
>         if (p->br->default_pvid) {
>                 rc = nbp_vlan_add(p, p->br->default_pvid,
>                                   BRIDGE_VLAN_INFO_PVID |
>                                   BRIDGE_VLAN_INFO_UNTAGGED);
>         }
>
>         return rc;
> }
>
> When L2 switching is offloaded to the HW, this PVID setting need to be propagated.

Agreed, it would be nice to have it propagated down, but there is a
non-ideal work-around.  If you set default_pvid=0 to turn off PVID,
then the switch port driver can pick some internal VLAN ID just for HW
purposes in matching untagged pkts.  It's non-ideal because the switch
port driver needs to reserve a block of VLAN IDs for internal usage or
use some other matching mechanism to keep untagged pkts within this
bridge.

Better to have default_pvid value propagated down.  But, default_pvid
is a per-bridge property, not a per-bridge-port property.
RTM_SETLINK/RTM_GETLINK for PF_BRIDGE does have AFSPEC for per-bridge
and PROTINFO for per-bridge-port, so it seems PVID needs to be part of
AFSPEC.

>However, it does not come via ndo_bridge_setlink. The proposed propagation at br_setlink or an up level one at rtnetlink are not capable of handling this case.
> One possible way for handling that is to replace the call to nbp_vlan_add with a call to a new function let's say
> int br_propagate_vlan_add(struct net_bridge_port *port, u16 vid, u16 flags)
> This function will compose a netlink message with VLAN filtering information (i.e. AF_SPEC with VLAN_INFO) and call br_setlink - leveraging the offload support proposed by Roopa.
>
> If this is an acceptable course of action, I could work on such patch.
>
>
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org] On Behalf Of Arad, Ronen
>> Sent: Friday, December 05, 2014 3:21 PM
>> To: Roopa Prabhu; Scott Feldman; Netdev
>> Cc: Jirí Pírko; Jamal Hadi Salim; Benjamin LaHaise; Thomas Graf; john
>> fastabend; stephen@networkplumber.org; John Linville;
>> nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian
>> Fainelli; buytenh@wantstofly.org; Aviad Raveh; David S. Miller;
>> shm@cumulusnetworks.com; Andy Gospodarek
>> Subject: RE: [PATCH 2/3] bridge: offload bridge port attributes to switch asic
>> if feature flag set
>>
>>
>>
>> > -----Original Message-----
>> > From: netdev-owner@vger.kernel.org [mailto:netdev-
>> > owner@vger.kernel.org] On Behalf Of Roopa Prabhu
>> > Sent: Thursday, December 04, 2014 11:02 PM
>> > To: Scott Feldman
>> > Cc: Jiří Pírko; Jamal Hadi Salim; Benjamin LaHaise; Thomas Graf; john
>> > fastabend; stephen@networkplumber.org; John Linville;
>> > nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian
>> > Fainelli; buytenh@wantstofly.org; Aviad Raveh; Netdev; David S.
>> > Miller; shm@cumulusnetworks.com; Andy Gospodarek
>> > Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to
>> > switch asic if feature flag set
>> >
>> > On 12/4/14, 10:41 PM, Scott Feldman wrote:
>> > > On Thu, Dec 4, 2014 at 6:26 PM,  <roopa@cumulusnetworks.com> wrote:
>> > >> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>> > >>
>> > >> This allows offloading to switch asic without having the user to
>> > >> set any flag. And this is done in the bridge driver to rollback
>> > >> kernel settings on hw offload failure if required in the future.
>> > >>
>> > >> With this, it also makes sure a notification goes out only after
>> > >> the attributes are set both in the kernel and hw.
>> > > I like this approach as it streamlines the steps for the user in
>> > > setting port flags.  There is one case for FLOODING where you'll
>> > > have to turn off flooding for both, and then turn on flooding in hw.
>> > > You don't want flooding turned on on kernel and hw.
>> > ok, maybe using the higher bits as in
>> > https://patchwork.ozlabs.org/patch/413211/
>> >
>> > might help with that. Let me think some more.
>> > >
>> > >> ---
>> > >>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>> > >>   1 file changed, 26 insertions(+), 1 deletion(-)
>> > >>
>> > >> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> > >> index
>> > >> 9f5eb55..ce173f0 100644
>> > >> --- a/net/bridge/br_netlink.c
>> > >> +++ b/net/bridge/br_netlink.c
>> > >> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct
>> > nlmsghdr *nlh)
>> > >>                                  afspec, RTM_SETLINK);
>> > >>          }
>> > >>
>> > >> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>> > >> +                       dev->netdev_ops->ndo_bridge_setlink) {
>> > >> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev,
>> > >> + nlh);
>> > > I think you want to up-level this to net/core/rtnetlink.c because
>> > > you're only enabling the feature for one instance of a driver that
>> > > implements ndo_bridge_setlink: the bridge driver.  If another driver
>> > > was MASTER and implemented ndo_bridge_setlink, you'd want same
>> check
>> > > to push setting down to SELF port driver.
>> >
>> > yeah, i thought about that. But i moved it here so that rollback would
>> > be easier.
>>
>> There is a need for propagating setlink/dellink requests down multiple levels.
>> The use-case I have in mind is a bridge at the top, team/bond in the middle,
>> and port devices at the bottom.
>> A setlink for VLAN filtering attributes would come with MASTER flag set, and
>> either port or bond/team netdev.
>> How would this be handled?
>>
>> The propagation rules between bridge and enslaved port device could be
>> different from those between bond/team and enslaved devices.
>> The current bridge driver does not propagate VLAN filtering from bridge to its
>> ports as each port could have different configuration. In a case of a
>> bond/team all members need to have the same configuration such that the a
>> bond/team would be indistinguishable from a simple port.
>>
>> Therefore rtnetlink.c might not have the knowledge for propagation across
>> multiple levels.
>> It seems that each device which implements
>> ndo_bridge_setlink/ndo_bridge_dellink  and could have master role, need to
>> take care of propagation to its slaves.
>>
>> > >
>> > >> +               if (ret && ret != -EOPNOTSUPP) {
>> > >> +                       /* XXX Fix this in the future to rollback
>> > >> +                        * kernel settings and return error
>> > >> +                        */
>> > > The future is now.  Let's fix this now for the rollback case (again
>> > > up in rtnetlink.c).  So then a general question comes to mind: for
>> > > these dual target sets, is it best to try HW first and then SW, or
>> > > the other way around?  Either way, on failure on second you need to
>> > > rollback first.  And, on failure, you need to know rollback value
>> > > for first, so you have to do a getlink on first before attempting set.
>> > yep, exactly, I went through the same thought process yesterday when i
>> > was trying to implement rollback.
>> > >
>> > >> +                       br_warn(p->br, "error offloading bridge attributes "
>> > >> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,
>> > >> +                                       p->dev->name);
>> > >> +               }
>> > >> +       }
>> > >> +
>> > >>          if (err == 0)
>> > >>                  br_ifinfo_notify(RTM_NEWLINK, p);
>> > >> -
>> > >>   out:
>> > >>          return err;
>> > >>   }
>> > >> @@ -433,6 +445,19 @@ int br_dellink(struct net_device *dev, struct
>> > nlmsghdr *nlh)
>> > >>          err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
>> > >>                          afspec, RTM_DELLINK);
>> > >>
>> > >> +       if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
>> > >> +                       && dev->netdev_ops->ndo_bridge_setlink) {
>> > >> +               int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
>> > >> +               if (ret && ret != -EOPNOTSUPP) {
>> > >> +                       /* XXX Fix this in the future to rollback
>> > >> +                        * kernel settings and return error
>> > >> +                        */
>> > >> +                       br_warn(p->br, "error offloading bridge attributes "
>> > >> +                                       "on port %u(%s)\n", (unsigned int) p->port_no,
>> > >> +                                       p->dev->name);
>> > >> +               }
>> > >> +       }
>> > >> +
>> > > Same comments as setlink above.
>> > >
>> > >>          return err;
>> > >>   }
>> > >>   static int br_validate(struct nlattr *tb[], struct nlattr
>> > >> *data[])
>> > >> --
>> > >> 1.7.10.4
>> > >>
>> >
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe netdev" in
>> > the body of a message to majordomo@vger.kernel.org More majordomo
>> info
>> > at http://vger.kernel.org/majordomo-info.html
>>   {.n +       +%  lzwm  b 맲  r  zw u   ^n r   z    h    &    G   h  ( 階 ݢj"     m     z ޖ
>> f   h   ~ m
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Scott Feldman Dec. 6, 2014, 6:54 a.m. UTC | #18
On Fri, Dec 5, 2014 at 3:21 PM, Arad, Ronen <ronen.arad@intel.com> wrote:
>
>
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org] On Behalf Of Roopa Prabhu
>> Sent: Thursday, December 04, 2014 11:02 PM
>> To: Scott Feldman
>> Cc: Jiří Pírko; Jamal Hadi Salim; Benjamin LaHaise; Thomas Graf; john
>> fastabend; stephen@networkplumber.org; John Linville;
>> nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian
>> Fainelli; buytenh@wantstofly.org; Aviad Raveh; Netdev; David S. Miller;
>> shm@cumulusnetworks.com; Andy Gospodarek
>> Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic
>> if feature flag set
>>
>> On 12/4/14, 10:41 PM, Scott Feldman wrote:
>> > On Thu, Dec 4, 2014 at 6:26 PM,  <roopa@cumulusnetworks.com> wrote:
>> >> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>> >>
>> >> This allows offloading to switch asic without having the user to set
>> >> any flag. And this is done in the bridge driver to rollback kernel
>> >> settings on hw offload failure if required in the future.
>> >>
>> >> With this, it also makes sure a notification goes out only after the
>> >> attributes are set both in the kernel and hw.
>> > I like this approach as it streamlines the steps for the user in
>> > setting port flags.  There is one case for FLOODING where you'll have
>> > to turn off flooding for both, and then turn on flooding in hw.  You
>> > don't want flooding turned on on kernel and hw.
>> ok, maybe using the higher bits as in
>> https://patchwork.ozlabs.org/patch/413211/
>>
>> might help with that. Let me think some more.
>> >
>> >> ---
>> >>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>> >>   1 file changed, 26 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index
>> >> 9f5eb55..ce173f0 100644
>> >> --- a/net/bridge/br_netlink.c
>> >> +++ b/net/bridge/br_netlink.c
>> >> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct
>> nlmsghdr *nlh)
>> >>                                  afspec, RTM_SETLINK);
>> >>          }
>> >>
>> >> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>> >> +                       dev->netdev_ops->ndo_bridge_setlink) {
>> >> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev,
>> >> + nlh);
>> > I think you want to up-level this to net/core/rtnetlink.c because
>> > you're only enabling the feature for one instance of a driver that
>> > implements ndo_bridge_setlink: the bridge driver.  If another driver
>> > was MASTER and implemented ndo_bridge_setlink, you'd want same check
>> > to push setting down to SELF port driver.
>>
>> yeah, i thought about that. But i moved it here so that rollback would be
>> easier.
>
> There is a need for propagating setlink/dellink requests down multiple levels.
> The use-case I have in mind is a bridge at the top, team/bond in the middle, and port devices at the bottom.
> A setlink for VLAN filtering attributes would come with MASTER flag set, and either port or bond/team netdev.
> How would this be handled?
>
> The propagation rules between bridge and enslaved port device could be different from those between bond/team and enslaved devices.
> The current bridge driver does not propagate VLAN filtering from bridge to its ports as each port could have different configuration. In a case of a bond/team all members need to have the same configuration such that the a bond/team would be indistinguishable from a simple port.
>
> Therefore rtnetlink.c might not have the knowledge for propagation across multiple levels.
> It seems that each device which implements ndo_bridge_setlink/ndo_bridge_dellink  and could have master role, need to take care of propagation to its slaves.

Thanks Ronen for bringing up this use-case of stacked masters.  I
think for VLAN filtering, the stacked master case is handled, not by
ndo_setlink/dellink at each level, but with ndo_vlan_rx_kill_vid and
ndo_vlan_rx_add_vid.  So the switch port driver can know VLAN
membership for port even if port is under bond which is under bridge,
by using ndo_vlan_rx_xxx and setting NETIF_F_HW_VLAN_CTAG_FILTER.  The
bonding driver's ndo_vlan_rx_xxx handlers seem to keep ports in bond
VLAN membership consistent across bond.

But in general, ndo_setlink/dellink don't work for the stack use-case
for other non-VLAN attributes.  Maybe the answer is to use the VLAN
propogation model for other attributes.  ndo_setlink/dellink/getlink
have enough weird-isms it might be time to define cleaner ndo ops to
propagate the other attrs down.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roopa Prabhu Dec. 7, 2014, 5:33 p.m. UTC | #19
On 12/6/14, 12:05 AM, Arad, Ronen wrote:
>
>> -----Original Message-----
>> From: Scott Feldman [mailto:sfeldma@gmail.com]
>> Sent: Friday, December 05, 2014 10:29 PM
>> To: Arad, Ronen
>> Cc: Roopa Prabhu; Netdev; Jirí Pírko; Jamal Hadi Salim; Benjamin LaHaise;
>> Thomas Graf; john fastabend; stephen@networkplumber.org; John Linville;
>> nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian
>> Fainelli; buytenh@wantstofly.org; Aviad Raveh; David S. Miller;
>> shm@cumulusnetworks.com; Andy Gospodarek
>> Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic
>> if feature flag set
>>
>> On Fri, Dec 5, 2014 at 5:04 PM, Arad, Ronen <ronen.arad@intel.com> wrote:
>>> I have another case of propagation which is not covered by the proposed
>> patch.
>>> A recent patch introduced default_pvid attribute for a bridge (so far
>> supported only via sysfs and not via netlink).
>>> When a port joins a bridge, it inherits a PVID from the default_pvid of the
>> bridge.
>>> The bridge driver propagates that to the newly created net_bridge_port.
>> This is done in br_vlan.c:
>>> int nbp_vlan_init(struct net_bridge_port *p) {
>>>          int rc = 0;
>>>
>>>          if (p->br->default_pvid) {
>>>                  rc = nbp_vlan_add(p, p->br->default_pvid,
>>>                                    BRIDGE_VLAN_INFO_PVID |
>>>                                    BRIDGE_VLAN_INFO_UNTAGGED);
>>>          }
>>>
>>>          return rc;
>>> }
>>>
>>> When L2 switching is offloaded to the HW, this PVID setting need to be
>> propagated.
>>
>> Agreed, it would be nice to have it propagated down, but there is a non-ideal
>> work-around.  If you set default_pvid=0 to turn off PVID, then the switch port
>> driver can pick some internal VLAN ID just for HW purposes in matching
>> untagged pkts.  It's non-ideal because the switch port driver needs to reserve
>> a block of VLAN IDs for internal usage or use some other matching
>> mechanism to keep untagged pkts within this bridge.
> This work-around let the administrator avoid using VID=1 as the default VLAN for untagged frames. However, it does not let the administrator pick a VID of her choice.
>
>> Better to have default_pvid value propagated down.  But, default_pvid is a
>> per-bridge property, not a per-bridge-port property.
>> RTM_SETLINK/RTM_GETLINK for PF_BRIDGE does have AFSPEC for per-bridge
>> and PROTINFO for per-bridge-port, so it seems PVID needs to be part of
>> AFSPEC.
> I believe AFSPEC is not limited to per-bridge properties. It is per-bridge when the netlink msg's ifindex is that of a bridge and SELF flag is set.
> AFSPEC is for a port when the netlink msg's ifindex is that of an enslaved port device and MASTER flag is set (or neither MASTER nor SELF flag is set)
> PVID is one of the flags associated with a VID in bridge_vlan_info.
correct.
> default_pvid is not currently supported by netlink. A new IFLA_BRIDGE_DEFAULT_PVID could be introduced to carry this property when a nlmsg is directed at a bridge.
>
>
correct again. And yes,  a netlink attribute to set default pvid is due.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roopa Prabhu Dec. 7, 2014, 7:13 p.m. UTC | #20
On 12/5/14, 3:21 PM, Arad, Ronen wrote:
>
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org] On Behalf Of Roopa Prabhu
>> Sent: Thursday, December 04, 2014 11:02 PM
>> To: Scott Feldman
>> Cc: Jiří Pírko; Jamal Hadi Salim; Benjamin LaHaise; Thomas Graf; john
>> fastabend; stephen@networkplumber.org; John Linville;
>> nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian
>> Fainelli; buytenh@wantstofly.org; Aviad Raveh; Netdev; David S. Miller;
>> shm@cumulusnetworks.com; Andy Gospodarek
>> Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic
>> if feature flag set
>>
>> On 12/4/14, 10:41 PM, Scott Feldman wrote:
>>> On Thu, Dec 4, 2014 at 6:26 PM,  <roopa@cumulusnetworks.com> wrote:
>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>
>>>> This allows offloading to switch asic without having the user to set
>>>> any flag. And this is done in the bridge driver to rollback kernel
>>>> settings on hw offload failure if required in the future.
>>>>
>>>> With this, it also makes sure a notification goes out only after the
>>>> attributes are set both in the kernel and hw.
>>> I like this approach as it streamlines the steps for the user in
>>> setting port flags.  There is one case for FLOODING where you'll have
>>> to turn off flooding for both, and then turn on flooding in hw.  You
>>> don't want flooding turned on on kernel and hw.
>> ok, maybe using the higher bits as in
>> https://patchwork.ozlabs.org/patch/413211/
>>
>> might help with that. Let me think some more.
>>>> ---
>>>>    net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>>>    1 file changed, 26 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index
>>>> 9f5eb55..ce173f0 100644
>>>> --- a/net/bridge/br_netlink.c
>>>> +++ b/net/bridge/br_netlink.c
>>>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct
>> nlmsghdr *nlh)
>>>>                                   afspec, RTM_SETLINK);
>>>>           }
>>>>
>>>> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>>>> +                       dev->netdev_ops->ndo_bridge_setlink) {
>>>> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev,
>>>> + nlh);
>>> I think you want to up-level this to net/core/rtnetlink.c because
>>> you're only enabling the feature for one instance of a driver that
>>> implements ndo_bridge_setlink: the bridge driver.  If another driver
>>> was MASTER and implemented ndo_bridge_setlink, you'd want same check
>>> to push setting down to SELF port driver.
>> yeah, i thought about that. But i moved it here so that rollback would be
>> easier.
> There is a need for propagating setlink/dellink requests down multiple levels.
> The use-case I have in mind is a bridge at the top, team/bond in the middle, and port devices at the bottom.
> A setlink for VLAN filtering attributes would come with MASTER flag set, and either port or bond/team netdev.
> How would this be handled?

Good point. glad that you brought this up.
>
> The propagation rules between bridge and enslaved port device could be different from those between bond/team and enslaved devices.
> The current bridge driver does not propagate VLAN filtering from bridge to its ports as each port could have different configuration. In a case of a bond/team all members need to have the same configuration such that the a bond/team would be indistinguishable from a simple port.
>
> Therefore rtnetlink.c might not have the knowledge for propagation across multiple levels.
> It seems that each device which implements ndo_bridge_setlink/ndo_bridge_dellink  and could have master role, need to take care of propagation to its slaves.

Are you suggesting all devices in such stack will implement 
ndo_bridge_setlink/ndo_bridge_dellink ?.
If yes,  Then that will not scale in terms of supporting all devices.

I am thinking an ndo op will not help here. On our systems, for such 
stacked devices, the switch driver is aware of all interfaces it cares 
about. In this case since the bond slaves are switch ports, it tracks 
the bond too. When the bond goes into the bridge, it tracks the bond as 
a bridge port.
Which means it uses any information a bridge driver calls on its port 
via ndo_bridge_setlink(). the port can be a bond or any device as long 
as the bridge port leads to a switch port.

As long as the bond driver does not need to do anything in software for 
this when its a bridge port, seems like the bridge driver should notify 
any switch devices that are interested in this setlink attributes (This 
includes the vlan information that you talk about).
To me this calls for switch device ops..... to go to the switch driver 
directly (context for this is my short talk at the BOF at LPC dusseldorf).
These swdev ops could be registered by the switch driver for all devices 
and at the time of the callback the switch driver can decide to service 
it or not, OR it could be registered only on devices the switch driver 
is interested in. (In this case when the bond becomes part of the bridge )

Will think about this some more, and will work on some RFC patches for 
this case.

Thanks,
Roopa

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roopa Prabhu Dec. 7, 2014, 8:24 p.m. UTC | #21
On 12/5/14, 10:54 PM, Scott Feldman wrote:
> On Fri, Dec 5, 2014 at 3:21 PM, Arad, Ronen <ronen.arad@intel.com> wrote:
>>
>>> -----Original Message-----
>>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>>> owner@vger.kernel.org] On Behalf Of Roopa Prabhu
>>> Sent: Thursday, December 04, 2014 11:02 PM
>>> To: Scott Feldman
>>> Cc: Jiří Pírko; Jamal Hadi Salim; Benjamin LaHaise; Thomas Graf; john
>>> fastabend; stephen@networkplumber.org; John Linville;
>>> nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian
>>> Fainelli; buytenh@wantstofly.org; Aviad Raveh; Netdev; David S. Miller;
>>> shm@cumulusnetworks.com; Andy Gospodarek
>>> Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic
>>> if feature flag set
>>>
>>> On 12/4/14, 10:41 PM, Scott Feldman wrote:
>>>> On Thu, Dec 4, 2014 at 6:26 PM,  <roopa@cumulusnetworks.com> wrote:
>>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>
>>>>> This allows offloading to switch asic without having the user to set
>>>>> any flag. And this is done in the bridge driver to rollback kernel
>>>>> settings on hw offload failure if required in the future.
>>>>>
>>>>> With this, it also makes sure a notification goes out only after the
>>>>> attributes are set both in the kernel and hw.
>>>> I like this approach as it streamlines the steps for the user in
>>>> setting port flags.  There is one case for FLOODING where you'll have
>>>> to turn off flooding for both, and then turn on flooding in hw.  You
>>>> don't want flooding turned on on kernel and hw.
>>> ok, maybe using the higher bits as in
>>> https://patchwork.ozlabs.org/patch/413211/
>>>
>>> might help with that. Let me think some more.
>>>>> ---
>>>>>    net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>>>>    1 file changed, 26 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index
>>>>> 9f5eb55..ce173f0 100644
>>>>> --- a/net/bridge/br_netlink.c
>>>>> +++ b/net/bridge/br_netlink.c
>>>>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct
>>> nlmsghdr *nlh)
>>>>>                                   afspec, RTM_SETLINK);
>>>>>           }
>>>>>
>>>>> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>>>>> +                       dev->netdev_ops->ndo_bridge_setlink) {
>>>>> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev,
>>>>> + nlh);
>>>> I think you want to up-level this to net/core/rtnetlink.c because
>>>> you're only enabling the feature for one instance of a driver that
>>>> implements ndo_bridge_setlink: the bridge driver.  If another driver
>>>> was MASTER and implemented ndo_bridge_setlink, you'd want same check
>>>> to push setting down to SELF port driver.
>>> yeah, i thought about that. But i moved it here so that rollback would be
>>> easier.
>> There is a need for propagating setlink/dellink requests down multiple levels.
>> The use-case I have in mind is a bridge at the top, team/bond in the middle, and port devices at the bottom.
>> A setlink for VLAN filtering attributes would come with MASTER flag set, and either port or bond/team netdev.
>> How would this be handled?
>>
>> The propagation rules between bridge and enslaved port device could be different from those between bond/team and enslaved devices.
>> The current bridge driver does not propagate VLAN filtering from bridge to its ports as each port could have different configuration. In a case of a bond/team all members need to have the same configuration such that the a bond/team would be indistinguishable from a simple port.
>>
>> Therefore rtnetlink.c might not have the knowledge for propagation across multiple levels.
>> It seems that each device which implements ndo_bridge_setlink/ndo_bridge_dellink  and could have master role, need to take care of propagation to its slaves.
> Thanks Ronen for bringing up this use-case of stacked masters.  I
> think for VLAN filtering, the stacked master case is handled, not by
> ndo_setlink/dellink at each level, but with ndo_vlan_rx_kill_vid and
> ndo_vlan_rx_add_vid.  So the switch port driver can know VLAN
> membership for port even if port is under bond which is under bridge,
> by using ndo_vlan_rx_xxx and setting NETIF_F_HW_VLAN_CTAG_FILTER.  The
> bonding driver's ndo_vlan_rx_xxx handlers seem to keep ports in bond
> VLAN membership consistent across bond.
>
> But in general, ndo_setlink/dellink don't work for the stack use-case
> for other non-VLAN attributes.  Maybe the answer is to use the VLAN
> propogation model for other attributes.  ndo_setlink/dellink/getlink
> have enough weird-isms it might be time to define cleaner ndo ops to
> propagate the other attrs down.
And, only the switch asic driver is interested in these attrs. So, seems 
like for these cases, we need to send these attrs to the switchdriver 
directly instead of going through the stack of netdevs ?. see my 
response to ronen's other email.

Thanks,
Roopa

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roopa Prabhu Dec. 8, 2014, 4:56 a.m. UTC | #22
On 12/7/14, 12:24 PM, Roopa Prabhu wrote:
> On 12/5/14, 10:54 PM, Scott Feldman wrote:
>> On Fri, Dec 5, 2014 at 3:21 PM, Arad, Ronen <ronen.arad@intel.com> 
>> wrote:
>>>
>>>> -----Original Message-----
>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>>>> owner@vger.kernel.org] On Behalf Of Roopa Prabhu
>>>> Sent: Thursday, December 04, 2014 11:02 PM
>>>> To: Scott Feldman
>>>> Cc: Jiří Pírko; Jamal Hadi Salim; Benjamin LaHaise; Thomas Graf; john
>>>> fastabend; stephen@networkplumber.org; John Linville;
>>>> nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian
>>>> Fainelli; buytenh@wantstofly.org; Aviad Raveh; Netdev; David S. 
>>>> Miller;
>>>> shm@cumulusnetworks.com; Andy Gospodarek
>>>> Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to 
>>>> switch asic
>>>> if feature flag set
>>>>
>>>> On 12/4/14, 10:41 PM, Scott Feldman wrote:
>>>>> On Thu, Dec 4, 2014 at 6:26 PM, <roopa@cumulusnetworks.com> wrote:
>>>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>>
>>>>>> This allows offloading to switch asic without having the user to set
>>>>>> any flag. And this is done in the bridge driver to rollback kernel
>>>>>> settings on hw offload failure if required in the future.
>>>>>>
>>>>>> With this, it also makes sure a notification goes out only after the
>>>>>> attributes are set both in the kernel and hw.
>>>>> I like this approach as it streamlines the steps for the user in
>>>>> setting port flags.  There is one case for FLOODING where you'll have
>>>>> to turn off flooding for both, and then turn on flooding in hw.  You
>>>>> don't want flooding turned on on kernel and hw.
>>>> ok, maybe using the higher bits as in
>>>> https://patchwork.ozlabs.org/patch/413211/
>>>>
>>>> might help with that. Let me think some more.
>>>>>> ---
>>>>>>    net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>>>>>    1 file changed, 26 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index
>>>>>> 9f5eb55..ce173f0 100644
>>>>>> --- a/net/bridge/br_netlink.c
>>>>>> +++ b/net/bridge/br_netlink.c
>>>>>> @@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct
>>>> nlmsghdr *nlh)
>>>>>> afspec, RTM_SETLINK);
>>>>>>           }
>>>>>>
>>>>>> +       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>>>>>> + dev->netdev_ops->ndo_bridge_setlink) {
>>>>>> +               int ret = dev->netdev_ops->ndo_bridge_setlink(dev,
>>>>>> + nlh);
>>>>> I think you want to up-level this to net/core/rtnetlink.c because
>>>>> you're only enabling the feature for one instance of a driver that
>>>>> implements ndo_bridge_setlink: the bridge driver.  If another driver
>>>>> was MASTER and implemented ndo_bridge_setlink, you'd want same check
>>>>> to push setting down to SELF port driver.
>>>> yeah, i thought about that. But i moved it here so that rollback 
>>>> would be
>>>> easier.
>>> There is a need for propagating setlink/dellink requests down 
>>> multiple levels.
>>> The use-case I have in mind is a bridge at the top, team/bond in the 
>>> middle, and port devices at the bottom.
>>> A setlink for VLAN filtering attributes would come with MASTER flag 
>>> set, and either port or bond/team netdev.
>>> How would this be handled?
>>>
>>> The propagation rules between bridge and enslaved port device could 
>>> be different from those between bond/team and enslaved devices.
>>> The current bridge driver does not propagate VLAN filtering from 
>>> bridge to its ports as each port could have different configuration. 
>>> In a case of a bond/team all members need to have the same 
>>> configuration such that the a bond/team would be indistinguishable 
>>> from a simple port.
>>>
>>> Therefore rtnetlink.c might not have the knowledge for propagation 
>>> across multiple levels.
>>> It seems that each device which implements 
>>> ndo_bridge_setlink/ndo_bridge_dellink  and could have master role, 
>>> need to take care of propagation to its slaves.
>> Thanks Ronen for bringing up this use-case of stacked masters. I
>> think for VLAN filtering, the stacked master case is handled, not by
>> ndo_setlink/dellink at each level, but with ndo_vlan_rx_kill_vid and
>> ndo_vlan_rx_add_vid.  So the switch port driver can know VLAN
>> membership for port even if port is under bond which is under bridge,
>> by using ndo_vlan_rx_xxx and setting NETIF_F_HW_VLAN_CTAG_FILTER.  The
>> bonding driver's ndo_vlan_rx_xxx handlers seem to keep ports in bond
>> VLAN membership consistent across bond.
>>
>> But in general, ndo_setlink/dellink don't work for the stack use-case
>> for other non-VLAN attributes.  Maybe the answer is to use the VLAN
>> propogation model for other attributes. ndo_setlink/dellink/getlink
>> have enough weird-isms it might be time to define cleaner ndo ops to
>> propagate the other attrs down.
> And, only the switch asic driver is interested in these attrs. So, 
> seems like for these cases, we need to send these attrs to the 
> switchdriver directly instead of going through the stack of netdevs ?. 
> see my response to ronen's other email.

As I work on v3 version of the patches, thinking some more, i am going 
to try to use the OFFLOAD feature flag to traverse the stack of netdevs 
(the bond to the slaves) to get to the switch port.  This is only for 
bridge port attrs for now.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Pirko Dec. 8, 2014, 11:14 a.m. UTC | #23
Sun, Dec 07, 2014 at 09:24:19PM CET, roopa@cumulusnetworks.com wrote:
>On 12/5/14, 10:54 PM, Scott Feldman wrote:
>>On Fri, Dec 5, 2014 at 3:21 PM, Arad, Ronen <ronen.arad@intel.com> wrote:
>>>
>>>>-----Original Message-----
>>>>From: netdev-owner@vger.kernel.org [mailto:netdev-
>>>>owner@vger.kernel.org] On Behalf Of Roopa Prabhu
>>>>Sent: Thursday, December 04, 2014 11:02 PM
>>>>To: Scott Feldman
>>>>Cc: Jiří Pírko; Jamal Hadi Salim; Benjamin LaHaise; Thomas Graf; john
>>>>fastabend; stephen@networkplumber.org; John Linville;
>>>>nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian
>>>>Fainelli; buytenh@wantstofly.org; Aviad Raveh; Netdev; David S. Miller;
>>>>shm@cumulusnetworks.com; Andy Gospodarek
>>>>Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic
>>>>if feature flag set
>>>>
>>>>On 12/4/14, 10:41 PM, Scott Feldman wrote:
>>>>>On Thu, Dec 4, 2014 at 6:26 PM,  <roopa@cumulusnetworks.com> wrote:
>>>>>>From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>>
>>>>>>This allows offloading to switch asic without having the user to set
>>>>>>any flag. And this is done in the bridge driver to rollback kernel
>>>>>>settings on hw offload failure if required in the future.
>>>>>>
>>>>>>With this, it also makes sure a notification goes out only after the
>>>>>>attributes are set both in the kernel and hw.
>>>>>I like this approach as it streamlines the steps for the user in
>>>>>setting port flags.  There is one case for FLOODING where you'll have
>>>>>to turn off flooding for both, and then turn on flooding in hw.  You
>>>>>don't want flooding turned on on kernel and hw.
>>>>ok, maybe using the higher bits as in
>>>>https://patchwork.ozlabs.org/patch/413211/
>>>>
>>>>might help with that. Let me think some more.
>>>>>>---
>>>>>>   net/bridge/br_netlink.c |   27 ++++++++++++++++++++++++++-
>>>>>>   1 file changed, 26 insertions(+), 1 deletion(-)
>>>>>>
>>>>>>diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index
>>>>>>9f5eb55..ce173f0 100644
>>>>>>--- a/net/bridge/br_netlink.c
>>>>>>+++ b/net/bridge/br_netlink.c
>>>>>>@@ -407,9 +407,21 @@ int br_setlink(struct net_device *dev, struct
>>>>nlmsghdr *nlh)
>>>>>>                                  afspec, RTM_SETLINK);
>>>>>>          }
>>>>>>
>>>>>>+       if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
>>>>>>+                       dev->netdev_ops->ndo_bridge_setlink) {
>>>>>>+               int ret = dev->netdev_ops->ndo_bridge_setlink(dev,
>>>>>>+ nlh);
>>>>>I think you want to up-level this to net/core/rtnetlink.c because
>>>>>you're only enabling the feature for one instance of a driver that
>>>>>implements ndo_bridge_setlink: the bridge driver.  If another driver
>>>>>was MASTER and implemented ndo_bridge_setlink, you'd want same check
>>>>>to push setting down to SELF port driver.
>>>>yeah, i thought about that. But i moved it here so that rollback would be
>>>>easier.
>>>There is a need for propagating setlink/dellink requests down multiple levels.
>>>The use-case I have in mind is a bridge at the top, team/bond in the middle, and port devices at the bottom.
>>>A setlink for VLAN filtering attributes would come with MASTER flag set, and either port or bond/team netdev.
>>>How would this be handled?
>>>
>>>The propagation rules between bridge and enslaved port device could be different from those between bond/team and enslaved devices.
>>>The current bridge driver does not propagate VLAN filtering from bridge to its ports as each port could have different configuration. In a case of a bond/team all members need to have the same configuration such that the a bond/team would be indistinguishable from a simple port.
>>>
>>>Therefore rtnetlink.c might not have the knowledge for propagation across multiple levels.
>>>It seems that each device which implements ndo_bridge_setlink/ndo_bridge_dellink  and could have master role, need to take care of propagation to its slaves.
>>Thanks Ronen for bringing up this use-case of stacked masters.  I
>>think for VLAN filtering, the stacked master case is handled, not by
>>ndo_setlink/dellink at each level, but with ndo_vlan_rx_kill_vid and
>>ndo_vlan_rx_add_vid.  So the switch port driver can know VLAN
>>membership for port even if port is under bond which is under bridge,
>>by using ndo_vlan_rx_xxx and setting NETIF_F_HW_VLAN_CTAG_FILTER.  The
>>bonding driver's ndo_vlan_rx_xxx handlers seem to keep ports in bond
>>VLAN membership consistent across bond.
>>
>>But in general, ndo_setlink/dellink don't work for the stack use-case
>>for other non-VLAN attributes.  Maybe the answer is to use the VLAN
>>propogation model for other attributes.  ndo_setlink/dellink/getlink
>>have enough weird-isms it might be time to define cleaner ndo ops to
>>propagate the other attrs down.
>And, only the switch asic driver is interested in these attrs. So, seems like
>for these cases, we need to send these attrs to the switchdriver directly
>instead of going through the stack of netdevs ?. see my response to ronen's
>other email.

I think that this should be handled similar to ndo_vlan_rx_add_vid,
ndo_vlan_rx_kill_vid, ndo_change_mtu and others. Master devices like
bridge, bond, team, etc should take care of propagating the calls to
lower devices. It mignt not make sense sometimes so let the masters to
decide.

I think that the feature bit (ethtool flag) should serve only for user
to actually enable or disable the offload. And thinking about that,
maybe the bit checking should be implemented in switch drivers, not in
bridge and friends.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 9f5eb55..ce173f0 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -407,9 +407,21 @@  int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
 				afspec, RTM_SETLINK);
 	}
 
+	if ((dev->features & NETIF_F_HW_SWITCH_OFFLOAD) &&
+			dev->netdev_ops->ndo_bridge_setlink) {
+		int ret = dev->netdev_ops->ndo_bridge_setlink(dev, nlh);
+		if (ret && ret != -EOPNOTSUPP) {
+			/* XXX Fix this in the future to rollback
+			 * kernel settings and return error
+			 */
+			br_warn(p->br, "error offloading bridge attributes " 
+					"on port %u(%s)\n", (unsigned int) p->port_no,
+					p->dev->name);
+		}
+	}
+
 	if (err == 0)
 		br_ifinfo_notify(RTM_NEWLINK, p);
-
 out:
 	return err;
 }
@@ -433,6 +445,19 @@  int br_dellink(struct net_device *dev, struct nlmsghdr *nlh)
 	err = br_afspec((struct net_bridge *)netdev_priv(dev), p,
 			afspec, RTM_DELLINK);
 
+	if (dev->features & NETIF_F_HW_SWITCH_OFFLOAD
+			&& dev->netdev_ops->ndo_bridge_setlink) {
+		int ret = dev->netdev_ops->ndo_bridge_dellink(dev, nlh);
+		if (ret && ret != -EOPNOTSUPP) {
+			/* XXX Fix this in the future to rollback
+			 * kernel settings and return error
+			 */
+			br_warn(p->br, "error offloading bridge attributes " 
+					"on port %u(%s)\n", (unsigned int) p->port_no,
+					p->dev->name);
+		}
+	}
+
 	return err;
 }
 static int br_validate(struct nlattr *tb[], struct nlattr *data[])