Message ID | cover.1550513384.git.mkubecek@suse.cz |
---|---|
Headers | show |
Series | ethtool netlink interface, part 1 | expand |
Mon, Feb 18, 2019 at 07:21:24PM CET, mkubecek@suse.cz wrote: >Note: this is marked as RFC because it's rather late in the cycle; the plan >is to make a regular submission (with changes based on review) once >net-next reopens after the 5.1 merge window. The full (work in progress) >series, together with the (userspace) ethtool counterpart can be found at >https://github.com/mkubecek/ethnl > >This is first part of alternative userspace interface for ethtool. The aim >is to address some long known issues with the ioctl interface, mainly lack >of extensibility, raciness, limited error reporting and absence of >notifications. > >The interface uses generic netlink family "ethtool"; it provides multicast >group "monitor" which is used for notifications. Documentation for the >interface is in Documentation/networking/ethtool-netlink.txt > >Basic concepts: > >- the goal is to provide all features of ioctl interface but allow > easier future extensions; at some point, it should be possible to have > full ethtool functionality without using the ioctl interface I'm not sure it is a good idea to map the existing iface to netlink fully. There are things in ethtool which are not really unique for Ethernet. Those things should be put somewhere else. >- inextensibility of ioctl interface resulted in way too many commands, > many of them obsoleted by newer ones; reduce the number by ignoring the > obsolete commands and grouping some together >- for "set" type commands, allows passing only the attributes to be > changed; therefore we don't need a get-modify-set cycle (which is > inherently racy), userspace can simply say what it wants to change >- provide notifications to multicast group "monitor" like rtnetlink does, > i.e. in the form of messages close to replies to "get" requests >- allow dump requests to get some information about all network devices > providing it >- be less dependent on ethtool and kernel being in sync; allow e.g. saying > "ethtool -s eth0 advertise foo off" without ethtool knowing what "foo" > means; it's kernel's job to know what mode "xyz" is and if it exists and > is supported > >Main changes between RFC v2 and RFC v3: > >- do not allow building as a module (no netdev notifiers needed) >- drop some obsolete fields >- add permanent hw address, timestamping and private flags support >- rework bitset handling to get rid of variable length arrays >- notify monitor on device renames >- restructure GET_SETTINGS/SET_SETTINGS messages >- split too long patches and submit only first part of the series > >Main changes between RFC v1 and RFC v2: > >- support dumps for all "get" requests >- provide notifications for changes related to supported request types >- support getting string sets (both global and per device) >- support getting/setting device features >- get rid of family specific header, everything passed as attributes >- split netlink code into multiple files in net/ethtool/ directory > >ToDo / open questions: > >- some features provided by ethtool would rather belong to devlink (and > some are already superseded by devlink); however, only few drivers > provide devlink interface at the moment and as recent discussion on > flashing revealed, we cannot rely on devlink's presence Could you explain why please? > >- while the netlink interface allows easy future extensions, ethtool_ops > interface does not; some settings could be implemented using tunables and > accessed via relevant netlink messages (as well as tunables) from > userspace but in the long term, something better will be needed > >- currently, all communication with drivers via ethtool_ops is done > under RTNL as this is what ioctl interface does and likely many > ethtool_ops handlers rely on that; if we are going to rework ethtool_ops > in the future ("phase two"), it would be nice to get rid of it > >- ethtool_ops should pass extack pointer to allow drivers more meaningful > error reporting; it's not clear, however, how to pass information about > offending attribute > >- notifications are sent whenever a change is done via netlink API or > ioctl API and for netdev features also whenever they are updated using > netdev_change_features(); it would be desirable to notify also about > link state and negotiation result (speed/duplex and partner link > modes) but it would be more tricky > >Michal Kubecek (21): > netlink: introduce nla_put_bitfield32() > ethtool: move to its own directory > ethtool: introduce ethtool netlink interface > ethtool: helper functions for netlink interface > ethtool: netlink bitset handling > ethtool: support for netlink notifications > ethtool: implement EVENT notifications > ethtool: generic handlers for GET requests > ethtool: move string arrays into common file > ethtool: provide string sets with GET_STRSET request > ethtool: provide driver/device information in GET_INFO request > ethtool: provide permanent hardware address in GET_INFO request > ethtool: provide timestamping information in GET_INFO request > ethtool: provide link mode names as a string set > ethtool: provide link settings and link modes in GET_SETTINGS request > ethtool: provide WoL information in GET_SETTINGS request > ethtool: provide message level in GET_SETTINGS request > ethtool: provide link state in GET_SETTINGS request > ethtool: provide device features in GET_SETTINGS request > ethtool: provide private flags in GET_SETTINGS request > ethtool: send netlink notifications about setting changes > > Documentation/networking/ethtool-netlink.txt | 441 +++++++++++++ > include/linux/ethtool_netlink.h | 17 + > include/linux/netdevice.h | 14 + > include/net/netlink.h | 15 + > include/uapi/linux/ethtool.h | 10 + > include/uapi/linux/ethtool_netlink.h | 265 ++++++++ > include/uapi/linux/net_tstamp.h | 13 + > net/Kconfig | 8 + > net/Makefile | 2 +- > net/core/Makefile | 2 +- > net/ethtool/Makefile | 7 + > net/ethtool/bitset.c | 572 +++++++++++++++++ > net/ethtool/bitset.h | 40 ++ > net/ethtool/common.c | 227 +++++++ > net/ethtool/common.h | 29 + > net/ethtool/info.c | 332 ++++++++++ > net/{core/ethtool.c => ethtool/ioctl.c} | 244 ++----- > net/ethtool/netlink.c | 634 +++++++++++++++++++ > net/ethtool/netlink.h | 252 ++++++++ > net/ethtool/settings.c | 559 ++++++++++++++++ > net/ethtool/strset.c | 461 ++++++++++++++ > 21 files changed, 3937 insertions(+), 207 deletions(-) > create mode 100644 Documentation/networking/ethtool-netlink.txt > create mode 100644 include/linux/ethtool_netlink.h > create mode 100644 include/uapi/linux/ethtool_netlink.h > create mode 100644 net/ethtool/Makefile > create mode 100644 net/ethtool/bitset.c > create mode 100644 net/ethtool/bitset.h > create mode 100644 net/ethtool/common.c > create mode 100644 net/ethtool/common.h > create mode 100644 net/ethtool/info.c > rename net/{core/ethtool.c => ethtool/ioctl.c} (91%) > create mode 100644 net/ethtool/netlink.c > create mode 100644 net/ethtool/netlink.h > create mode 100644 net/ethtool/settings.c > create mode 100644 net/ethtool/strset.c > >-- >2.20.1 >
On Tue, Feb 19, 2019 at 11:35:08AM +0100, Jiri Pirko wrote: > >- some features provided by ethtool would rather belong to devlink (and > > some are already superseded by devlink); however, only few drivers > > provide devlink interface at the moment and as recent discussion on > > flashing revealed, we cannot rely on devlink's presence > > Could you explain why please? What I mean is the problem discussed under Jakub's devlink flash patchset: that he couldn't implement only the devlink callback in nfp and rely on the generic fallback to devlink because it wouldn't work if devlink is built as a module. But I think this should be addressed. If we agree that flashing (and other features provided by ethtool at the moment) rather belongs to devlink (which nobody seems to oppose), we should rather try to make it possible for drivers to provide only the devlink callback and gradually move all in-tree drivers to doing so. (And one day, remove it from ethtool_ops.) It doesn't seem to make much sense to have devlink as a module in such scenario. Michal
Tue, Feb 19, 2019 at 12:57:27PM CET, mkubecek@suse.cz wrote: >On Tue, Feb 19, 2019 at 11:35:08AM +0100, Jiri Pirko wrote: >> >- some features provided by ethtool would rather belong to devlink (and >> > some are already superseded by devlink); however, only few drivers >> > provide devlink interface at the moment and as recent discussion on >> > flashing revealed, we cannot rely on devlink's presence >> >> Could you explain why please? > >What I mean is the problem discussed under Jakub's devlink flash >patchset: that he couldn't implement only the devlink callback in nfp >and rely on the generic fallback to devlink because it wouldn't work if >devlink is built as a module. So let's fix that. > >But I think this should be addressed. If we agree that flashing (and >other features provided by ethtool at the moment) rather belongs to >devlink (which nobody seems to oppose), we should rather try to make it >possible for drivers to provide only the devlink callback and gradually >move all in-tree drivers to doing so. (And one day, remove it from >ethtool_ops.) It doesn't seem to make much sense to have devlink as >a module in such scenario. Agreed. > >Michal
Hi Michal, Let me start with a big thank you for tackling this humongous amount of work! On 2/18/2019 10:21 AM, Michal Kubecek wrote: > Note: this is marked as RFC because it's rather late in the cycle; the plan > is to make a regular submission (with changes based on review) once > net-next reopens after the 5.1 merge window. The full (work in progress) > series, together with the (userspace) ethtool counterpart can be found at > https://github.com/mkubecek/ethnl > > This is first part of alternative userspace interface for ethtool. The aim > is to address some long known issues with the ioctl interface, mainly lack > of extensibility, raciness, limited error reporting and absence of > notifications. > > The interface uses generic netlink family "ethtool"; it provides multicast > group "monitor" which is used for notifications. Documentation for the > interface is in Documentation/networking/ethtool-netlink.txt > > Basic concepts: > > - the goal is to provide all features of ioctl interface but allow > easier future extensions; at some point, it should be possible to have > full ethtool functionality without using the ioctl interface +1 > - inextensibility of ioctl interface resulted in way too many commands, > many of them obsoleted by newer ones; reduce the number by ignoring the > obsolete commands and grouping some together > - for "set" type commands, allows passing only the attributes to be > changed; therefore we don't need a get-modify-set cycle (which is > inherently racy), userspace can simply say what it wants to change > - provide notifications to multicast group "monitor" like rtnetlink does, > i.e. in the form of messages close to replies to "get" requests +1 > - allow dump requests to get some information about all network devices > providing it And it seems like you have also added some filtering capability with at least the settings dump, right? That is also highly welcome. > - be less dependent on ethtool and kernel being in sync; allow e.g. saying > "ethtool -s eth0 advertise foo off" without ethtool knowing what "foo" > means; it's kernel's job to know what mode "xyz" is and if it exists and > is supported > > Main changes between RFC v2 and RFC v3: > > - do not allow building as a module (no netdev notifiers needed) I would very much prefer that we could build this as a module. Some systems might be memory constrained or opt to disable ethtool entirely (security?). If this is not too much trouble, can we try to maintain that? > - drop some obsolete fields > - add permanent hw address, timestamping and private flags support > - rework bitset handling to get rid of variable length arrays > - notify monitor on device renames > - restructure GET_SETTINGS/SET_SETTINGS messages > - split too long patches and submit only first part of the series > > Main changes between RFC v1 and RFC v2: > > - support dumps for all "get" requests > - provide notifications for changes related to supported request types > - support getting string sets (both global and per device) > - support getting/setting device features > - get rid of family specific header, everything passed as attributes > - split netlink code into multiple files in net/ethtool/ directory > > ToDo / open questions: > > - some features provided by ethtool would rather belong to devlink (and > some are already superseded by devlink); however, only few drivers > provide devlink interface at the moment and as recent discussion on > flashing revealed, we cannot rely on devlink's presence Should we just move everything under devlink given that ethtool over netlink or devlink are essentially manipulating the same (or close to) objects? It seems to me that at least now is the right time to make decisions about what should stay in ethtool and be offered via netlink, what should be migrated to something else (e.g.: Jiri mentioned rtnetlink), and what is a devlink candidate. How do we proceed to easily triage which of these netlink facilities should things be routed to/from? > > - while the netlink interface allows easy future extensions, ethtool_ops > interface does not; some settings could be implemented using tunables and > accessed via relevant netlink messages (as well as tunables) from > userspace but in the long term, something better will be needed Right, so as a driver writer, one thing I kind of hate is having to fill a netlink skb myself because that's a lot of work, and it makes it fairly hard to do mass conversion of internal APIs due to the increased complexity in auditing what drivers do. > > - currently, all communication with drivers via ethtool_ops is done > under RTNL as this is what ioctl interface does and likely many > ethtool_ops handlers rely on that; if we are going to rework ethtool_ops > in the future ("phase two"), it would be nice to get rid of it > > - ethtool_ops should pass extack pointer to allow drivers more meaningful > error reporting; it's not clear, however, how to pass information about > offending attribute > > - notifications are sent whenever a change is done via netlink API or > ioctl API and for netdev features also whenever they are updated using > netdev_change_features(); it would be desirable to notify also about > link state and negotiation result (speed/duplex and partner link > modes) but it would be more tricky Historically you listen for a link event and you determine the link parameters from there, I don't know if there is much value in giving the full link parameter list, especially given those could theoretically change depending on your medium (e.g.: wireless), or there could be a lot to query at that time... > > Michal Kubecek (21): > netlink: introduce nla_put_bitfield32() > ethtool: move to its own directory > ethtool: introduce ethtool netlink interface > ethtool: helper functions for netlink interface > ethtool: netlink bitset handling > ethtool: support for netlink notifications > ethtool: implement EVENT notifications > ethtool: generic handlers for GET requests > ethtool: move string arrays into common file > ethtool: provide string sets with GET_STRSET request > ethtool: provide driver/device information in GET_INFO request > ethtool: provide permanent hardware address in GET_INFO request > ethtool: provide timestamping information in GET_INFO request > ethtool: provide link mode names as a string set > ethtool: provide link settings and link modes in GET_SETTINGS request > ethtool: provide WoL information in GET_SETTINGS request > ethtool: provide message level in GET_SETTINGS request > ethtool: provide link state in GET_SETTINGS request > ethtool: provide device features in GET_SETTINGS request > ethtool: provide private flags in GET_SETTINGS request > ethtool: send netlink notifications about setting changes > > Documentation/networking/ethtool-netlink.txt | 441 +++++++++++++ > include/linux/ethtool_netlink.h | 17 + > include/linux/netdevice.h | 14 + > include/net/netlink.h | 15 + > include/uapi/linux/ethtool.h | 10 + > include/uapi/linux/ethtool_netlink.h | 265 ++++++++ > include/uapi/linux/net_tstamp.h | 13 + > net/Kconfig | 8 + > net/Makefile | 2 +- > net/core/Makefile | 2 +- > net/ethtool/Makefile | 7 + > net/ethtool/bitset.c | 572 +++++++++++++++++ > net/ethtool/bitset.h | 40 ++ > net/ethtool/common.c | 227 +++++++ > net/ethtool/common.h | 29 + > net/ethtool/info.c | 332 ++++++++++ > net/{core/ethtool.c => ethtool/ioctl.c} | 244 ++----- > net/ethtool/netlink.c | 634 +++++++++++++++++++ > net/ethtool/netlink.h | 252 ++++++++ > net/ethtool/settings.c | 559 ++++++++++++++++ > net/ethtool/strset.c | 461 ++++++++++++++ > 21 files changed, 3937 insertions(+), 207 deletions(-) > create mode 100644 Documentation/networking/ethtool-netlink.txt > create mode 100644 include/linux/ethtool_netlink.h > create mode 100644 include/uapi/linux/ethtool_netlink.h > create mode 100644 net/ethtool/Makefile > create mode 100644 net/ethtool/bitset.c > create mode 100644 net/ethtool/bitset.h > create mode 100644 net/ethtool/common.c > create mode 100644 net/ethtool/common.h > create mode 100644 net/ethtool/info.c > rename net/{core/ethtool.c => ethtool/ioctl.c} (91%) > create mode 100644 net/ethtool/netlink.c > create mode 100644 net/ethtool/netlink.h > create mode 100644 net/ethtool/settings.c > create mode 100644 net/ethtool/strset.c >
On Wed, Feb 20, 2019 at 07:21:18PM -0800, Florian Fainelli wrote: > > Main changes between RFC v2 and RFC v3: > > > > - do not allow building as a module (no netdev notifiers needed) > > I would very much prefer that we could build this as a module. Some > systems might be memory constrained or opt to disable ethtool entirely > (security?). If this is not too much trouble, can we try to maintain that? There were two reasons why I agreed so quickly to the suggestion to make CONFIG_ETHTOOL a bool: 1. I didn't really think there would be anyone who would benefit from having the interface as a loadable module and the only reason why I made it possible originally was that it was easier to test changes that way. That was rather poor reason for keeping it once submitted but if there is an actual demand from real life users, it's something different. 2. Using of ethtool netdev notifiers for triggering netlink notifications from other code (mostly ioctl interface) was rather clumsy. However, it could be worked around using an RCU pointer and some simple helpers (as discussed in the devlink thread) so this not not a serious problem either. > > - some features provided by ethtool would rather belong to devlink (and > > some are already superseded by devlink); however, only few drivers > > provide devlink interface at the moment and as recent discussion on > > flashing revealed, we cannot rely on devlink's presence > > Should we just move everything under devlink given that ethtool over > netlink or devlink are essentially manipulating the same (or close to) > objects? It seems to me that at least now is the right time to make > decisions about what should stay in ethtool and be offered via netlink, > what should be migrated to something else (e.g.: Jiri mentioned > rtnetlink), and what is a devlink candidate. > > How do we proceed to easily triage which of these netlink facilities > should things be routed to/from? At the moment I would say the likely candidates for rtnetlink are permanent hw address (ethtool -P), netdev features (-k / -K) and link state (part of "ethtool <dev>"). (I'm not very happy about the prospect of moving netdev features to rtnetlink but it seems hard to argue why they shouldn't belong there.) As for devlink, I would see eeprom dump (-e) and write (-E), register dump (-d), firmware dump (-w / -W) and flash (-f), and module eeprom dump (-m) as candidates. Maybe also device reset (--reset) (which was put into ethtool_ops, though, even if added only recently). There is also another question: at which level to do the split. I would like to avoid doint it between user and ethtool utility, i.e. dropping some functions from ethtool and pointing people to iproute2 tools (or even replacing ethtool with some new tool). There is the "ifconfig memento": 20 years after it became obsolete, I still keep hearing people say "I don't want to learn some new tool with unfamiliar syntax". (Some of them learning to use ifconfig few years ago doesn't really help. Neither them saying the same about ss which copied the most of netstat's basic options.) I'm afraid the result would be exactly the same with ethtool. Then there is an option of providing the functions in ethtool but using both ethtool netlink and devlink (and probably also rtnetlink) to talk to kernel. This means a bit more complicated userspace code but nothing serious. The bigger problem I see with this approach is that so far only few NIC drivers have any devlink support. Unless we would want to add ethtool_ops fallback for devlink, we would have to wait until the relevant features are available via devlink in all in-tree NIC drivers which provide them via ethtool_ops now. Providing the features via ethtool netlink interface but having them backed up by devlink would mean we would have two userspace interfaces for the same (or almost the same) function. It's something I could live with but I understand that not everyone would like this. The advantage is that it would allow migrating the features from ethtool_ops to devlink gradually. > > - while the netlink interface allows easy future extensions, ethtool_ops > > interface does not; some settings could be implemented using tunables and > > accessed via relevant netlink messages (as well as tunables) from > > userspace but in the long term, something better will be needed > > Right, so as a driver writer, one thing I kind of hate is having to fill > a netlink skb myself because that's a lot of work, and it makes it > fairly hard to do mass conversion of internal APIs due to the increased > complexity in auditing what drivers do. I would also prefer to avoid parsing and generating netlink messages in drivers for regular operations. Michal