diff mbox

team: add support to get speed via ethtool

Message ID 1425592115-1750-1-git-send-email-sridhar.samudrala@intel.com
State Rejected, archived
Delegated to: David Miller

Commit Message

Samudrala, Sridhar March 5, 2015, 9:48 p.m. UTC
With this patch ethtool <team> OR cat /sys/class/net/<team>/speed
returns the speed of team based on member ports speed and state.

Based on get speed support in bonding driver.

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
---
 drivers/net/team/team.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

Comments

David Miller March 6, 2015, 5:24 a.m. UTC | #1
From: Sridhar Samudrala <sridhar.samudrala@intel.com>
Date: Thu,  5 Mar 2015 13:48:35 -0800

> +	list_for_each_entry(port, &team->port_list, list) {
> +		if (port->linkup)
> +			speed += port->state.speed;
> +		if (ecmd->duplex == DUPLEX_UNKNOWN &&
> +		    port->state.duplex != 0)
> +			ecmd->duplex = port->state.duplex;

This makes no freakin' sense at all.  Adding the speeds together and
returning that?  Are you kidding me?  Reporting only one of the
duplex settings?  Are you kidding me?

Repeat after me: Speed and duplex have no meaning on software devices.

Especially for software devices which aggregate links.

If the user wants the speed in a format that is actually useful, he
has to actually know what the geography of the bond or team slaves is,
and since he knows that he can probe the individual hardware devices
for speed and duplex information.
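
To make the objection concrete, here is a minimal userspace sketch (not
kernel code) comparing the summed value the patch would report with what
a single flow could use. It assumes a hash-based mode that pins each flow
to one port; struct member and both function names are hypothetical and
do not correspond to anything in team.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of one member port (not the kernel's struct team_port). */
struct member {
	const char *name;
	bool linkup;		/* port has carrier */
	bool active;		/* port carries traffic in the current mode */
	unsigned int speed;	/* Mb/s */
};

/* The value the patch would report: the sum over every port with link. */
unsigned int summed_speed(const struct member *m, size_t n)
{
	unsigned int sum = 0;

	assert(m != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (m[i].linkup)
			sum += m[i].speed;
	return sum;
}

/*
 * Upper bound for one flow, assuming each flow is pinned to one port:
 * the fastest port that has link and is actually carrying traffic.
 */
unsigned int single_flow_ceiling(const struct member *m, size_t n)
{
	unsigned int best = 0;

	assert(m != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (m[i].linkup && m[i].active && m[i].speed > best)
			best = m[i].speed;
	return best;
}
```

For an active-backup pair of 10000 Mb/s ports, summed_speed() gives 20000
while only 10000 is usable at any time; for a mixed 1000 + 10000 pair it
gives 11000, a rate no individual flow can reach under the assumption
above.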

I'm not applying anything like this.

There appears to be some mania afoot about trying to return ethtool
speed/duplex settings on software layering and tunneling devices,
can someone please cure this illness before I see more patches like
this one?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Pirko March 6, 2015, 7:16 a.m. UTC | #2
Thu, Mar 05, 2015 at 10:48:35PM CET, sridhar.samudrala@intel.com wrote:
>With this patch ethtool <team> OR cat /sys/class/net/<team>/speed
>returns the speed of team based on member ports speed and state.
>
>Based on get speed support in bonding driver.
>
>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>---
> drivers/net/team/team.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
>diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
>index 9d3366f..e822803 100644
>--- a/drivers/net/team/team.c
>+++ b/drivers/net/team/team.c
>@@ -1954,6 +1954,30 @@ static int team_change_carrier(struct net_device *dev, bool new_carrier)
> 	return 0;
> }
> 
>+static int team_ethtool_get_settings(struct net_device *dev,
>+				     struct ethtool_cmd *ecmd)
>+{
>+	struct team *team = netdev_priv(dev);
>+	struct team_port *port;
>+	unsigned long speed = 0;
>+
>+	ecmd->duplex = DUPLEX_UNKNOWN;
>+	ecmd->port = PORT_OTHER;
>+
>+	mutex_lock(&team->lock);
>+	list_for_each_entry(port, &team->port_list, list) {
>+		if (port->linkup)
>+			speed += port->state.speed;
>+		if (ecmd->duplex == DUPLEX_UNKNOWN &&
>+		    port->state.duplex != 0)
>+			ecmd->duplex = port->state.duplex;
>+	}
>+	ethtool_cmd_speed_set(ecmd, speed);
>+	mutex_unlock(&team->lock);
>+
>+	return 0;
>+}

Sridhar, what exactly are you trying to achieve? I agree with DaveM that
this makes no sense for soft devices. The fact that bonding has it is a
mistake.


>+
> static const struct net_device_ops team_netdev_ops = {
> 	.ndo_init		= team_init,
> 	.ndo_uninit		= team_uninit,
>@@ -1995,6 +2019,7 @@ static void team_ethtool_get_drvinfo(struct net_device *dev,
> static const struct ethtool_ops team_ethtool_ops = {
> 	.get_drvinfo		= team_ethtool_get_drvinfo,
> 	.get_link		= ethtool_op_get_link,
>+	.get_settings		= team_ethtool_get_settings,
> };
> 
> /***********************
>-- 
>1.8.4.2
>
Samudrala, Sridhar March 6, 2015, 7:30 p.m. UTC | #3
On 3/5/2015 11:16 PM, Jiri Pirko wrote:
> Thu, Mar 05, 2015 at 10:48:35PM CET, sridhar.samudrala@intel.com wrote:
>> With this patch ethtool <team> OR cat /sys/class/net/<team>/speed
>> returns the speed of team based on member ports speed and state.
>>
>> Based on get speed support in bonding driver.
>>
>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> ---
>> drivers/net/team/team.c | 25 +++++++++++++++++++++++++
>> 1 file changed, 25 insertions(+)
>>
>> diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
>> index 9d3366f..e822803 100644
>> --- a/drivers/net/team/team.c
>> +++ b/drivers/net/team/team.c
>> @@ -1954,6 +1954,30 @@ static int team_change_carrier(struct net_device *dev, bool new_carrier)
>> 	return 0;
>> }
>>
>> +static int team_ethtool_get_settings(struct net_device *dev,
>> +				     struct ethtool_cmd *ecmd)
>> +{
>> +	struct team *team = netdev_priv(dev);
>> +	struct team_port *port;
>> +	unsigned long speed = 0;
>> +
>> +	ecmd->duplex = DUPLEX_UNKNOWN;
>> +	ecmd->port = PORT_OTHER;
>> +
>> +	mutex_lock(&team->lock);
>> +	list_for_each_entry(port, &team->port_list, list) {
>> +		if (port->linkup)
>> +			speed += port->state.speed;
>> +		if (ecmd->duplex == DUPLEX_UNKNOWN &&
>> +		    port->state.duplex != 0)
>> +			ecmd->duplex = port->state.duplex;
>> +	}
>> +	ethtool_cmd_speed_set(ecmd, speed);
>> +	mutex_unlock(&team->lock);
>> +
>> +	return 0;
>> +}
> Sridhar, what exactly are you trying to achieve? I agree with DaveM that
> this makes no sense for soft devices. The fact that bonding has it is a
> mistake.
>
We are currently looking into the possibility of using team as a way to
offload link aggregation support to switch hardware.
To support LAG, a team device is created and the switch ports are added
as members of the team. We are considering if we should create a new team
mode specifically to support offload or the existing modes can be extended
to enable offloading. Will appreciate any thoughts you have on this?

For the specific usecase of a team where all the member ports correspond
to offloaded switch ports, does it make sense to support getting speed OR
we should just leave it to the mgmt apps to figure out the speed based on
the speed of the member ports?

Thanks
Sridhar


Andy Gospodarek March 7, 2015, 2:42 a.m. UTC | #4
On Fri, Mar 06, 2015 at 11:30:53AM -0800, Samudrala, Sridhar wrote:
> 
> On 3/5/2015 11:16 PM, Jiri Pirko wrote:
> >Thu, Mar 05, 2015 at 10:48:35PM CET, sridhar.samudrala@intel.com wrote:
> >>With this patch ethtool <team> OR cat /sys/class/net/<team>/speed
> >>returns the speed of team based on member ports speed and state.
> >>
> >>Based on get speed support in bonding driver.
> >>
> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> >>---
> >>drivers/net/team/team.c | 25 +++++++++++++++++++++++++
> >>1 file changed, 25 insertions(+)
> >>
> >>diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
> >>index 9d3366f..e822803 100644
> >>--- a/drivers/net/team/team.c
> >>+++ b/drivers/net/team/team.c
> >>@@ -1954,6 +1954,30 @@ static int team_change_carrier(struct net_device *dev, bool new_carrier)
> >>	return 0;
> >>}
> >>
> >>+static int team_ethtool_get_settings(struct net_device *dev,
> >>+				     struct ethtool_cmd *ecmd)
> >>+{
> >>+	struct team *team = netdev_priv(dev);
> >>+	struct team_port *port;
> >>+	unsigned long speed = 0;
> >>+
> >>+	ecmd->duplex = DUPLEX_UNKNOWN;
> >>+	ecmd->port = PORT_OTHER;
> >>+
> >>+	mutex_lock(&team->lock);
> >>+	list_for_each_entry(port, &team->port_list, list) {
> >>+		if (port->linkup)
> >>+			speed += port->state.speed;
> >>+		if (ecmd->duplex == DUPLEX_UNKNOWN &&
> >>+		    port->state.duplex != 0)
> >>+			ecmd->duplex = port->state.duplex;
> >>+	}
> >>+	ethtool_cmd_speed_set(ecmd, speed);
> >>+	mutex_unlock(&team->lock);
> >>+
> >>+	return 0;
> >>+}
> >Sridhar, what exactly are you trying to achieve? I agree with DaveM that
> >this makes no sense for soft devices. The fact that bonding has it is a
> >mistake.
> >
> We are currently looking into the possibility of using team as a way to
> offload link aggregation support to switch hardware.
> To support LAG, a team device is created and the switch ports are added
> as members of the team. We are considering if we should create a new team
> mode specifically to support offload or the existing modes can be extended
> to enable offloading. Will appreciate any thoughts you have on this?

There is no real need to create an offload mode for teaming or bonding.
There is actually a similar thread happening right now to discuss how to
offload bonding to hardware, I would suggest browsing that and
discussing in that thread[1].

FWIW, we offload bonding right now without much issue by monitoring
netlink events.  All the hardware really needs to know is the membership
information and the hash algorithm the user might like to set and both
of those are available on recent kernels via netlink.  Based on the
hardware we are using and the in-kernel infrastructure that existed at
the time it was the proper choice.
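
A rough sketch of the netlink-monitoring approach described above, in
userspace C. It assumes membership is learned from the IFLA_MASTER
attribute of RTM_NEWLINK notifications on an rtnetlink socket; the
function names open_link_monitor() and link_master_ifindex() are made up
for illustration, and reading the hash policy is not shown.

```c
#include <sys/socket.h>
#include <assert.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <string.h>
#include <unistd.h>

/* Open an rtnetlink socket subscribed to link notifications; fd or -1. */
int open_link_monitor(void)
{
	struct sockaddr_nl sa;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	memset(&sa, 0, sizeof(sa));
	sa.nl_family = AF_NETLINK;
	sa.nl_groups = RTMGRP_LINK;	/* RTM_NEWLINK / RTM_DELLINK events */
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Inspect one netlink message. Returns the ifindex of the master
 * (bond or team) the link is enslaved to, 0 if the message carries no
 * IFLA_MASTER attribute, or -1 if it is not an RTM_NEWLINK message.
 */
int link_master_ifindex(const struct nlmsghdr *nh)
{
	struct ifinfomsg *ifi;
	struct rtattr *rta;
	int len;

	assert(nh != NULL);
	if (nh->nlmsg_type != RTM_NEWLINK)
		return -1;
	ifi = NLMSG_DATA(nh);
	rta = IFLA_RTA(ifi);
	len = IFLA_PAYLOAD(nh);
	for (; RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
		if (rta->rta_type == IFLA_MASTER) {
			int master;

			memcpy(&master, RTA_DATA(rta), sizeof(master));
			return master;
		}
	}
	return 0;
}
```

A monitoring daemon would recv() on the descriptor in a loop, walk the
buffer with NLMSG_OK()/NLMSG_NEXT(), and hand each message to
link_master_ifindex() to learn which ports joined or left an aggregate.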

1. [PATCH net-next 0/2] dsa: implement HW bonding
http://patchwork.ozlabs.org/patch/441907/


Andy Gospodarek March 7, 2015, 3:10 a.m. UTC | #5
On Fri, Mar 06, 2015 at 12:24:10AM -0500, David Miller wrote:
> From: Sridhar Samudrala <sridhar.samudrala@intel.com>
> Date: Thu,  5 Mar 2015 13:48:35 -0800
> 
> > +	list_for_each_entry(port, &team->port_list, list) {
> > +		if (port->linkup)
> > +			speed += port->state.speed;
> > +		if (ecmd->duplex == DUPLEX_UNKNOWN &&
> > +		    port->state.duplex != 0)
> > +			ecmd->duplex = port->state.duplex;
> 
> This makes no freakin' sense at all.  Adding the speeds together and
> returning that?  Are you kidding me?  Reporting only one of the
> duplex settings?  Are you kidding me?
> 
> Repeat after me: Speed and duplex have no meaning on software devices.
> 
> Especially for software devices which aggregate links.
> 
> If the user wants the speed in a format that is actually useful, he
> has to actually know what the geography of the bond or team slaves is,
> and since he knows that he can probe the individual hardware devices
> for speed and duplex information.

IIRC the value of a patch like this (where the underlying device is
backed by real hardware) is for remote monitoring tools.  I completely
agree that local users have no problem determining link utilization or
max bandwidth available by simply adding up all the members of the bond,
but this was not possible for SNMP-based tools that are not aware of the
aggregation of ports on the host.

> I'm not applying anything like this.
> 
> There appears to be some mania afoot about trying to return ethtool
> speed/duplex settings on software layering and tunneling devices,
> can someone please cure this illness before I see more patches like
> this one?

You have made it clear that despite the value others see in it, you are
opposed to setting the speed and duplex on things like tuntap and I
agree with you on this.  Doing that does nothing but continue to enable
offload hardware to live in userspace and that is not the proper
direction the kernel should take.
David Miller March 7, 2015, 6:12 a.m. UTC | #6
From: Andy Gospodarek <gospo@cumulusnetworks.com>
Date: Fri, 6 Mar 2015 22:10:13 -0500

> but this was not possible for SNMP-based tools that are not aware of
> the aggregation of ports on the host.

The suggested value to report HAS NO MEANING.

If you just add the speeds up that's complete bullshit.

Nothing prevents the SNMP userland from doing the right thing
and figuring out the geography and perhaps even exporting
that geography to SNMP querying agents.
David Miller March 7, 2015, 6:14 a.m. UTC | #7
From: Andy Gospodarek <gospo@cumulusnetworks.com>
Date: Fri, 6 Mar 2015 22:10:13 -0500

> You have made it clear that despite the value others see in it, you are
> opposed to setting the speed and duplex on things like tuntap and I
> agree with you on this.  Doing that does nothing but continue to enable
> offload hardware to live in userspace and that is not the proper
> direction the kernel should take.

This is a scarecrow.

HW offloading has nothing at all to do with reporting complete
bullshit link speed and duplex values on team/bonding devices.
Andy Gospodarek March 7, 2015, 12:30 p.m. UTC | #8
On Sat, Mar 07, 2015 at 01:12:49AM -0500, David Miller wrote:
> From: Andy Gospodarek <gospo@cumulusnetworks.com>
> Date: Fri, 6 Mar 2015 22:10:13 -0500
> 
> > but this was not possible for SNMP-based tools that are not aware of
> > the aggregation of ports on the host.
> 
> The suggested value to report HAS NO MEANING.
> 
> If you just add the speeds up that's complete bullshit.
> 
> Nothing prevents the SNMP userland from doing the right thing
> and figuring out the geography and perhaps even exporting
> that geography to SNMP querying agents.

I was actually pretty surprised that this patch stuck when I first
posted it for that very reason.  I'm quite sure I posted this as a RHEL
customer put in a request for the feature, but since I'm clearly less
invested today in that subset of Linux users than I was when I posted
that patch it doesn't bother me one bit if it is pulled!

Andy Gospodarek March 7, 2015, 12:47 p.m. UTC | #9
On Sat, Mar 07, 2015 at 01:14:43AM -0500, David Miller wrote:
> From: Andy Gospodarek <gospo@cumulusnetworks.com>
> Date: Fri, 6 Mar 2015 22:10:13 -0500
> 
> > You have made it clear that despite the value others see in it, you are
> > opposed to setting the speed and duplex on things like tuntap and I
> > agree with you on this.  Doing that does nothing but continue to enable
> > offload hardware to live in userspace and that is not the proper
> > direction the kernel should take.
> 
> This is a scarecrow.
> 
> HW offloading has nothing at all to do with reporting complete
> bullshit link speed and duplex values on team/bonding devices.

I guess I read too much into the comments in this thread and merged them
with the ones on the changing/setting of tuntap speeds.  I'll circle
back with Stephen and Zhu to see if they want to resubmit something that
satisfies everyone's needs.

Jiri Pirko March 7, 2015, 1:57 p.m. UTC | #10
Fri, Mar 06, 2015 at 08:30:53PM CET, sridhar.samudrala@intel.com wrote:
>
>On 3/5/2015 11:16 PM, Jiri Pirko wrote:
>>Thu, Mar 05, 2015 at 10:48:35PM CET, sridhar.samudrala@intel.com wrote:
>>>With this patch ethtool <team> OR cat /sys/class/net/<team>/speed
>>>returns the speed of team based on member ports speed and state.
>>>
>>>Based on get speed support in bonding driver.
>>>
>>>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>>>---
>>>drivers/net/team/team.c | 25 +++++++++++++++++++++++++
>>>1 file changed, 25 insertions(+)
>>>
>>>diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
>>>index 9d3366f..e822803 100644
>>>--- a/drivers/net/team/team.c
>>>+++ b/drivers/net/team/team.c
>>>@@ -1954,6 +1954,30 @@ static int team_change_carrier(struct net_device *dev, bool new_carrier)
>>>	return 0;
>>>}
>>>
>>>+static int team_ethtool_get_settings(struct net_device *dev,
>>>+				     struct ethtool_cmd *ecmd)
>>>+{
>>>+	struct team *team = netdev_priv(dev);
>>>+	struct team_port *port;
>>>+	unsigned long speed = 0;
>>>+
>>>+	ecmd->duplex = DUPLEX_UNKNOWN;
>>>+	ecmd->port = PORT_OTHER;
>>>+
>>>+	mutex_lock(&team->lock);
>>>+	list_for_each_entry(port, &team->port_list, list) {
>>>+		if (port->linkup)
>>>+			speed += port->state.speed;
>>>+		if (ecmd->duplex == DUPLEX_UNKNOWN &&
>>>+		    port->state.duplex != 0)
>>>+			ecmd->duplex = port->state.duplex;
>>>+	}
>>>+	ethtool_cmd_speed_set(ecmd, speed);
>>>+	mutex_unlock(&team->lock);
>>>+
>>>+	return 0;
>>>+}
>>Sridhar, what exactly are you trying to achieve? I agree with DaveM that
>>this makes no sense for soft devices. The fact that bonding has it is a
>>mistake.
>>
>We are currently looking into the possibility of using team as a way to
>offload link aggregation support to switch hardware.
>To support LAG, a team device is created and the switch ports are added
>as members of the team. We are considering if we should create a new team
>mode specifically to support offload or the existing modes can be extended
>to enable offloading. Will appreciate any thoughts you have on this?

No, please no special modes. That is unacceptable. Just use what exists,
extend it if needed, then offload it. That is the way to go.


>
>For the specific usecase of a team where all the member ports correspond
>to offloaded switch ports, does it make sense to support getting speed OR
>we should just leave it to the mgmt apps to figure out the speed based on
>the speed of the member ports?


I do not see any reason for it. mgmt app can find out how devices are
stacked and can query the port devices directly. No need to pretend
something on team layer for it.
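
One way a management application could do that from userspace, sketched
under the assumption that the kernel exposes the stacking as lower_<port>
entries under /sys/class/net/<team>/ and each port's rate in its own
speed attribute. The helper names below are hypothetical.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

#define SYSFS_NET "/sys/class/net"

/* "lower_eth0" -> "eth0"; NULL for any other directory entry. */
const char *lower_name(const char *entry)
{
	static const char prefix[] = "lower_";
	size_t plen = sizeof(prefix) - 1;

	assert(entry != NULL);
	if (strncmp(entry, prefix, plen) != 0 || entry[plen] == '\0')
		return NULL;
	return entry + plen;
}

/* Read <dev>'s speed attribute in Mb/s; -1 if missing, unreadable or unknown. */
long read_speed_mbps(const char *dev)
{
	char path[512];
	long speed = -1;
	FILE *f;

	snprintf(path, sizeof(path), SYSFS_NET "/%s/speed", dev);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &speed) != 1)
		speed = -1;
	fclose(f);
	return speed;
}

/* Print one line per member port of <upper>; returns the port count or -1. */
int print_member_speeds(const char *upper)
{
	char path[512];
	struct dirent *de;
	int count = 0;
	DIR *d;

	snprintf(path, sizeof(path), SYSFS_NET "/%s", upper);
	d = opendir(path);
	if (!d)
		return -1;
	while ((de = readdir(d)) != NULL) {
		const char *port = lower_name(de->d_name);

		if (!port)
			continue;
		printf("%s: %ld Mb/s\n", port, read_speed_mbps(port));
		count++;
	}
	closedir(d);
	return count;
}
```

Calling print_member_speeds("team0") would list each port with its own
speed and leave it to the application to decide how, or whether, to
combine them.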


Jiri

>
>Thanks
>Sridhar
>
>

Patch

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 9d3366f..e822803 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1954,6 +1954,30 @@  static int team_change_carrier(struct net_device *dev, bool new_carrier)
 	return 0;
 }
 
+static int team_ethtool_get_settings(struct net_device *dev,
+				     struct ethtool_cmd *ecmd)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	unsigned long speed = 0;
+
+	ecmd->duplex = DUPLEX_UNKNOWN;
+	ecmd->port = PORT_OTHER;
+
+	mutex_lock(&team->lock);
+	list_for_each_entry(port, &team->port_list, list) {
+		if (port->linkup)
+			speed += port->state.speed;
+		if (ecmd->duplex == DUPLEX_UNKNOWN &&
+		    port->state.duplex != 0)
+			ecmd->duplex = port->state.duplex;
+	}
+	ethtool_cmd_speed_set(ecmd, speed);
+	mutex_unlock(&team->lock);
+
+	return 0;
+}
+
 static const struct net_device_ops team_netdev_ops = {
 	.ndo_init		= team_init,
 	.ndo_uninit		= team_uninit,
@@ -1995,6 +2019,7 @@  static void team_ethtool_get_drvinfo(struct net_device *dev,
 static const struct ethtool_ops team_ethtool_ops = {
 	.get_drvinfo		= team_ethtool_get_drvinfo,
 	.get_link		= ethtool_op_get_link,
+	.get_settings		= team_ethtool_get_settings,
 };
 
 /***********************