diff mbox

net-sysfs: Report link speed only when possible

Message ID 771f792d4d4391c656305a522557c53c4e84a666.1402043894.git.mprivozn@redhat.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Michal Prívozník June 6, 2014, 8:40 a.m. UTC
The link speed is available at /sys/class/net/$nic/speed.
However, in some cases, depending on the driver, if the link is
not plugged, -1 is reported (this is the case of e1000e for
instance). To make things worse, the value is printed out as an
unsigned integer, so you'll get this shady number which you can't
evaluate correctly. This is actually a regression in 3.X kernels
as the commit that broke things is 8ae6daca. With this change,
you'll get -EINVAL whenever a -1 is to be printed out.

Before the change:
  # cat /sys/class/net/eth0/speed
  4294967295

After the change:
  # cat /sys/class/net/eth0/speed
  cat: /sys/class/net/eth0/speed: Invalid argument

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 net/core/net-sysfs.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Jiri Pirko June 6, 2014, 8:57 a.m. UTC | #1
Fri, Jun 06, 2014 at 10:40:30AM CEST, mprivozn@redhat.com wrote:
>The link speed is available at /sys/class/net/$nic/speed.
>However, in some cases, depending on the driver, if the link is
>not plugged, -1 is reported (this is the case of e1000e for
>instance). To make things worse, the value is printed out as an

Actually, SPEED_UNKNOWN is also -1
So e1000e is not any exception.


>unsigned integer, so you'll get this shady number which you can't
>evaluate correctly. This is actually a regression in 3.X kernels
>as the commit that broke things is 8ae6daca. With this change,
>you'll get -EINVAL whenever a -1 is to be printed out.
>
>Before the change:
>  # cat /sys/class/net/eth0/speed
>  4294967295
>
>After the change:
>  # cat /sys/class/net/eth0/speed
>  cat: /sys/class/net/eth0/speed: Invalid argument
>
>Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>---
> net/core/net-sysfs.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
>diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>index 1cac29e..ce4b298 100644
>--- a/net/core/net-sysfs.c
>+++ b/net/core/net-sysfs.c
>@@ -172,8 +172,13 @@ static ssize_t speed_show(struct device *dev,
> 
> 	if (netif_running(netdev)) {
> 		struct ethtool_cmd cmd;
>-		if (!__ethtool_get_settings(netdev, &cmd))
>-			ret = sprintf(buf, fmt_udec, ethtool_cmd_speed(&cmd));
>+		if (!__ethtool_get_settings(netdev, &cmd)) {
>+			__u32 speed = ethtool_cmd_speed(&cmd);
>+
>+			if (speed != (__u32) -1)

Just rather use SPEED_UNKNOWN instead of -1 here.


>+				ret = sprintf(buf, fmt_udec,
>+					      ethtool_cmd_speed(&cmd));
>+		}
> 	}
> 	rtnl_unlock();
> 	return ret;
>-- 
>2.0.0
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller June 6, 2014, 7:54 p.m. UTC | #2
From: Jiri Pirko <jiri@resnulli.us>
Date: Fri, 6 Jun 2014 10:57:33 +0200

> Fri, Jun 06, 2014 at 10:40:30AM CEST, mprivozn@redhat.com wrote:
>>The link speed is available at /sys/class/net/$nic/speed.
>>However, in some cases, depending on the driver, if the link is
>>not plugged, -1 is reported (this is the case of e1000e for
>>instance). To make things worse, the value is printed out as an
> 
> Actually, SPEED_UNKNOWN is also -1
> So e1000e is not any exception.

And pity the person who is handling this by evaluating that unsigned
value, we'll break them.

We can't keep changing behavior for the SPEED_UNKOWN case back and
forth.

A program that wants to work with all kernels now has to handle three
different kinds of behavior if we apply this patch, that's not making
things better, it's making things worse.

I'm not applying a patch that does this, sorry.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michal Prívozník June 13, 2014, 9:19 a.m. UTC | #3
On 06.06.2014 21:54, David Miller wrote:
> From: Jiri Pirko <jiri@resnulli.us>
> Date: Fri, 6 Jun 2014 10:57:33 +0200
>
>> Fri, Jun 06, 2014 at 10:40:30AM CEST, mprivozn@redhat.com wrote:
>>> The link speed is available at /sys/class/net/$nic/speed.
>>> However, in some cases, depending on the driver, if the link is
>>> not plugged, -1 is reported (this is the case of e1000e for
>>> instance). To make things worse, the value is printed out as an
>>
>> Actually, SPEED_UNKNOWN is also -1
>> So e1000e is not any exception.
>
> And pity the person who is handling this by evaluating that unsigned
> value, we'll break them.
>
> We can't keep changing behavior for the SPEED_UNKOWN case back and
> forth.
>
> A program that wants to work with all kernels now has to handle three
> different kinds of behavior if we apply this patch, that's not making
> things better, it's making things worse.

I don't think this is a good reason to keep things broken. By the same 
token we wouldn't need to fix any bug since there already might be an 
application that learned how to deal with it.

This patch is not introducing any new class that applications need to 
learn. It just barely removes the spurious one. And it's perfectly okay 
in some ecosystems to require minimal version of some programs, 
including kernel. So if I were developing brand new application I could 
say: I'm dropping all this workaround code and have it clean and require 
say 3.16 kernel at least. Moreover, using an older application would do, 
as the workaround code will never be executed.

But okay, maybe for some reason you're hesitant to change net-sysfs.c. 
Would it be more convenient to change the broken drivers instead? Yes, 
some drivers don't report nor 0, nor -1, but rather random value. For 
instance a tap device returns '10',  igb '65535', etc.

Michal
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller June 13, 2014, 8:03 p.m. UTC | #4
From: Michal Privoznik <mprivozn@redhat.com>
Date: Fri, 13 Jun 2014 11:19:51 +0200

> So if I were developing brand new application I could say: I'm
> dropping all this workaround code and have it clean and require say
> 3.16 kernel at least.

Then your application wouldn't be usable on %99 of systems for a long
long time.

I don't think this is the right tradeoff at all.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michal Prívozník June 16, 2014, 7:32 a.m. UTC | #5
On 13.06.2014 22:03, David Miller wrote:
> From: Michal Privoznik <mprivozn@redhat.com>
> Date: Fri, 13 Jun 2014 11:19:51 +0200
>
>> So if I were developing brand new application I could say: I'm
>> dropping all this workaround code and have it clean and require say
>> 3.16 kernel at least.
>
> Then your application wouldn't be usable on %99 of systems for a long
> long time.
>

How come? The application is going to be usable for as long as 
library/kernel APIs won't change. Or until the time a new regression is 
introduced and fix is rejected. Speaking of which - long long time 
applications *are* broken now. This patch is combining the good from 
both worlds: old applications are fixed, new applications doesn't have 
to learn anything new.

> I don't think this is the right tradeoff at all.
>

Neither is keeping things broken.

Michal
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller June 16, 2014, 8:11 a.m. UTC | #6
From: Michal Privoznik <mprivozn@redhat.com>
Date: Mon, 16 Jun 2014 09:32:35 +0200

> On 13.06.2014 22:03, David Miller wrote:
>> From: Michal Privoznik <mprivozn@redhat.com>
>> Date: Fri, 13 Jun 2014 11:19:51 +0200
>>
>>> So if I were developing brand new application I could say: I'm
>>> dropping all this workaround code and have it clean and require say
>>> 3.16 kernel at least.
>>
>> Then your application wouldn't be usable on %99 of systems for a long
>> long time.
>>
> 
> How come? The application is going to be usable for as long as
> library/kernel APIs won't change.

Because %99 of users are using a distribution kernel which is definitely
going to be pre-3.16 for years.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michal Prívozník June 16, 2014, 8:30 a.m. UTC | #7
On 16.06.2014 10:11, David Miller wrote:
> From: Michal Privoznik <mprivozn@redhat.com>
> Date: Mon, 16 Jun 2014 09:32:35 +0200
>
>> On 13.06.2014 22:03, David Miller wrote:
>>> From: Michal Privoznik <mprivozn@redhat.com>
>>> Date: Fri, 13 Jun 2014 11:19:51 +0200
>>>
>>>> So if I were developing brand new application I could say: I'm
>>>> dropping all this workaround code and have it clean and require say
>>>> 3.16 kernel at least.
>>>
>>> Then your application wouldn't be usable on %99 of systems for a long
>>> long time.
>>>
>>
>> How come? The application is going to be usable for as long as
>> library/kernel APIs won't change.
>
> Because %99 of users are using a distribution kernel which is definitely
> going to be pre-3.16 for years.
>

That's why every distribution out there has a mechanism to install 
packages of a certain version, or those providing certain symbol, 
whatever. Or distributions can then backport some kernel patches or 
something. But, that's completely unrelated to the problem I'm fixing 
here. I don't think this bikeshedding is useful for anything, sorry.

Michal
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller June 16, 2014, 8:44 a.m. UTC | #8
From: Michal Privoznik <mprivozn@redhat.com>
Date: Mon, 16 Jun 2014 10:30:27 +0200

> On 16.06.2014 10:11, David Miller wrote:
>> From: Michal Privoznik <mprivozn@redhat.com>
>> Date: Mon, 16 Jun 2014 09:32:35 +0200
>>
>>> On 13.06.2014 22:03, David Miller wrote:
>>>> From: Michal Privoznik <mprivozn@redhat.com>
>>>> Date: Fri, 13 Jun 2014 11:19:51 +0200
>>>>
>>>>> So if I were developing brand new application I could say: I'm
>>>>> dropping all this workaround code and have it clean and require say
>>>>> 3.16 kernel at least.
>>>>
>>>> Then your application wouldn't be usable on %99 of systems for a long
>>>> long time.
>>>>
>>>
>>> How come? The application is going to be usable for as long as
>>> library/kernel APIs won't change.
>>
>> Because %99 of users are using a distribution kernel which is
>> definitely
>> going to be pre-3.16 for years.
>>
> 
> That's why every distribution out there has a mechanism to install
> packages of a certain version, or those providing certain symbol,
> whatever. Or distributions can then backport some kernel patches or
> something. But, that's completely unrelated to the problem I'm fixing
> here. I don't think this bikeshedding is useful for anything, sorry.

You're being entirely impractical.

By restricting an application to a kernel version or behavior "via
backported patches" which doesn't even exist yet, you are foolishly
restricting your userbase.

People just cope with what the current kernels support, when possible,
and that's the right thing to do because we cannot break it on them
exactly because people can depend upon the behavior.

NOBODY is checking for -EINVAL returns when reading the link speed
sysfs file, and therefore by signalling it you will break
applications.

So I will not apply a patch which adds that new behavior, sorry.

I am not willing to discuss this further, this is fundamental and
simple as far as I'm concerned.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michal Prívozník June 16, 2014, 8:59 a.m. UTC | #9
On 16.06.2014 10:44, David Miller wrote:
> From: Michal Privoznik <mprivozn@redhat.com>
> Date: Mon, 16 Jun 2014 10:30:27 +0200
>
>> On 16.06.2014 10:11, David Miller wrote:
>>> From: Michal Privoznik <mprivozn@redhat.com>
>>> Date: Mon, 16 Jun 2014 09:32:35 +0200
>>>
>>>> On 13.06.2014 22:03, David Miller wrote:
>>>>> From: Michal Privoznik <mprivozn@redhat.com>
>>>>> Date: Fri, 13 Jun 2014 11:19:51 +0200
>>>>>
>>>>>> So if I were developing brand new application I could say: I'm
>>>>>> dropping all this workaround code and have it clean and require say
>>>>>> 3.16 kernel at least.
>>>>>
>>>>> Then your application wouldn't be usable on %99 of systems for a long
>>>>> long time.
>>>>>
>>>>
>>>> How come? The application is going to be usable for as long as
>>>> library/kernel APIs won't change.
>>>
>>> Because %99 of users are using a distribution kernel which is
>>> definitely
>>> going to be pre-3.16 for years.
>>>
>>
>> That's why every distribution out there has a mechanism to install
>> packages of a certain version, or those providing certain symbol,
>> whatever. Or distributions can then backport some kernel patches or
>> something. But, that's completely unrelated to the problem I'm fixing
>> here. I don't think this bikeshedding is useful for anything, sorry.
>
> You're being entirely impractical.
>
> By restricting an application to a kernel version or behavior "via
> backported patches" which doesn't even exist yet, you are foolishly
> restricting your userbase.

So? Users still have choice of not using my application. I'm okay with that.

>
> People just cope with what the current kernels support, when possible,
> and that's the right thing to do because we cannot break it on them
> exactly because people can depend upon the behavior.

Once again, we are not breaking anything. Current applications continue 
to work. I don't understand why you keep writing the opposite over and 
over again.

>
> NOBODY is checking for -EINVAL returns when reading the link speed
> sysfs file, and therefore by signalling it you will break
> applications.

That's very interesting thing to say, since even now one can experience 
EINVAL:

# cat /sys/class/net/wlan0/speed
cat: /sys/class/net/wlan0/speed: Invalid argument

How do you know for sure that NOBODY is checking -EINVAL?
For example libvirt does check EINVAL:

http://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/virnetdev.c;h=a551f9820b97aac41bbcb19c84d102c0ec3bd0aa;hb=HEAD#l1891

How can a kernel developer state that NOBODY isn't using possible kernel 
API anyway?

>
> So I will not apply a patch which adds that new behavior, sorry.

That's okay.

>
> I am not willing to discuss this further, this is fundamental and
> simple as far as I'm concerned.
>

Sure it is. That's why I'm surprised we even need to have this discussion.

Michal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Pirko June 16, 2014, 9:01 a.m. UTC | #10
Mon, Jun 16, 2014 at 10:44:30AM CEST, davem@davemloft.net wrote:
>From: Michal Privoznik <mprivozn@redhat.com>
>Date: Mon, 16 Jun 2014 10:30:27 +0200
>
>> On 16.06.2014 10:11, David Miller wrote:
>>> From: Michal Privoznik <mprivozn@redhat.com>
>>> Date: Mon, 16 Jun 2014 09:32:35 +0200
>>>
>>>> On 13.06.2014 22:03, David Miller wrote:
>>>>> From: Michal Privoznik <mprivozn@redhat.com>
>>>>> Date: Fri, 13 Jun 2014 11:19:51 +0200
>>>>>
>>>>>> So if I were developing brand new application I could say: I'm
>>>>>> dropping all this workaround code and have it clean and require say
>>>>>> 3.16 kernel at least.
>>>>>
>>>>> Then your application wouldn't be usable on %99 of systems for a long
>>>>> long time.
>>>>>
>>>>
>>>> How come? The application is going to be usable for as long as
>>>> library/kernel APIs won't change.
>>>
>>> Because %99 of users are using a distribution kernel which is
>>> definitely
>>> going to be pre-3.16 for years.
>>>
>> 
>> That's why every distribution out there has a mechanism to install
>> packages of a certain version, or those providing certain symbol,
>> whatever. Or distributions can then backport some kernel patches or
>> something. But, that's completely unrelated to the problem I'm fixing
>> here. I don't think this bikeshedding is useful for anything, sorry.
>
>You're being entirely impractical.
>
>By restricting an application to a kernel version or behavior "via
>backported patches" which doesn't even exist yet, you are foolishly
>restricting your userbase.
>
>People just cope with what the current kernels support, when possible,
>and that's the right thing to do because we cannot break it on them
>exactly because people can depend upon the behavior.
>
>NOBODY is checking for -EINVAL returns when reading the link speed
>sysfs file, and therefore by signalling it you will break
>applications.
>
>So I will not apply a patch which adds that new behavior, sorry.
>
>I am not willing to discuss this further, this is fundamental and
>simple as far as I'm concerned.
>

Let's just hope we do not introduce some other, more serious bug
somewhere else in user api. I see that such things are unfixable :/
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 1cac29e..ce4b298 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -172,8 +172,13 @@  static ssize_t speed_show(struct device *dev,
 
 	if (netif_running(netdev)) {
 		struct ethtool_cmd cmd;
-		if (!__ethtool_get_settings(netdev, &cmd))
-			ret = sprintf(buf, fmt_udec, ethtool_cmd_speed(&cmd));
+		if (!__ethtool_get_settings(netdev, &cmd)) {
+			__u32 speed = ethtool_cmd_speed(&cmd);
+
+			if (speed != (__u32) -1)
+				ret = sprintf(buf, fmt_udec,
+					      ethtool_cmd_speed(&cmd));
+		}
 	}
 	rtnl_unlock();
 	return ret;