mbox series

[RFC,net-next,v3,00/21] ethtool netlink interface, part 1

Message ID cover.1550513384.git.mkubecek@suse.cz
Headers show
Series ethtool netlink interface, part 1 | expand

Message

Michal Kubecek Feb. 18, 2019, 6:21 p.m. UTC
Note: this is marked as RFC because it's rather late in the cycle; the plan
is to make a regular submission (with changes based on review) once
net-next reopens after the 5.1 merge window. The full (work in progress)
series, together with the (userspace) ethtool counterpart can be found at
https://github.com/mkubecek/ethnl

This is first part of alternative userspace interface for ethtool. The aim
is to address some long known issues with the ioctl interface, mainly lack
of extensibility, raciness, limited error reporting and absence of
notifications.

The interface uses generic netlink family "ethtool"; it provides multicast
group "monitor" which is used for notifications. Documentation for the
interface is in Documentation/networking/ethtool-netlink.txt

Basic concepts:

- the goal is to provide all features of ioctl interface but allow
  easier future extensions; at some point, it should be possible to have
  full ethtool functionality without using the ioctl interface
- inextensibility of ioctl interface resulted in way too many commands,
  many of them obsoleted by newer ones; reduce the number by  ignoring the
  obsolete commands and grouping some together
- for "set" type commands, allows passing only the attributes to be
  changed; therefore we don't need a get-modify-set cycle (which is
  inherently racy), userspace can simply say what it wants to change
- provide notifications to multicast group "monitor" like rtnetlink does,
  i.e. in the form of messages close to replies to "get" requests
- allow dump requests to get some information about all network devices
  providing it
- be less dependent on ethtool and kernel being in sync; allow e.g. saying
  "ethtool -s eth0 advertise foo off" without ethtool knowing what "foo"
  means; it's kernel's job to know what mode "xyz" is and if it exists and
  is supported

Main changes between RFC v2 and RFC v3:

- do not allow building as a module (no netdev notifiers needed)
- drop some obsolete fields
- add permanent hw address, timestamping and private flags support
- rework bitset handling to get rid of variable length arrays
- notify monitor on device renames
- restructure GET_SETTINGS/SET_SETTINGS messages
- split too long patches and submit only first part of the series

Main changes between RFC v1 and RFC v2:

- support dumps for all "get" requests
- provide notifications for changes related to supported request types
- support getting string sets (both global and per device)
- support getting/setting device features
- get rid of family specific header, everything passed as attributes
- split netlink code into multiple files in net/ethtool/ directory

ToDo / open questions:

- some features provided by ethtool would rather belong to devlink (and
  some are already superseded by devlink); however, only few drivers
  provide devlink interface at the moment and as recent discussion on
  flashing revealed, we cannot rely on devlink's presence

- while the netlink interface allows easy future extensions, ethtool_ops
  interface does not; some settings could be implemented using tunables and
  accessed via relevant netlink messages (as well as tunables) from
  userspace but in the long term, something better will be needed

- currently, all communication with drivers via ethtool_ops is done
  under RTNL as this is what ioctl interface does and likely many
  ethtool_ops handlers rely on that; if we are going to rework ethtool_ops
  in the future ("phase two"), it would be nice to get rid of it

- ethtool_ops should pass extack pointer to allow drivers more meaningful
  error reporting; it's not clear, however, how to pass information about
  offending attribute

- notifications are sent whenever a change is done via netlink API or
  ioctl API and for netdev features also whenever they are updated using
  netdev_change_features(); it would be desirable to notify also about
  link state and negotiation result (speed/duplex and partner link
  modes) but it would be more tricky

Michal Kubecek (21):
  netlink: introduce nla_put_bitfield32()
  ethtool: move to its own directory
  ethtool: introduce ethtool netlink interface
  ethtool: helper functions for netlink interface
  ethtool: netlink bitset handling
  ethtool: support for netlink notifications
  ethtool: implement EVENT notifications
  ethtool: generic handlers for GET requests
  ethtool: move string arrays into common file
  ethtool: provide string sets with GET_STRSET request
  ethtool: provide driver/device information in GET_INFO request
  ethtool: provide permanent hardware address in GET_INFO request
  ethtool: provide timestamping information in GET_INFO request
  ethtool: provide link mode names as a string set
  ethtool: provide link settings and link modes in GET_SETTINGS request
  ethtool: provide WoL information in GET_SETTINGS request
  ethtool: provide message level in GET_SETTINGS request
  ethtool: provide link state in GET_SETTINGS request
  ethtool: provide device features in GET_SETTINGS request
  ethtool: provide private flags in GET_SETTINGS request
  ethtool: send netlink notifications about setting changes

 Documentation/networking/ethtool-netlink.txt | 441 +++++++++++++
 include/linux/ethtool_netlink.h              |  17 +
 include/linux/netdevice.h                    |  14 +
 include/net/netlink.h                        |  15 +
 include/uapi/linux/ethtool.h                 |  10 +
 include/uapi/linux/ethtool_netlink.h         | 265 ++++++++
 include/uapi/linux/net_tstamp.h              |  13 +
 net/Kconfig                                  |   8 +
 net/Makefile                                 |   2 +-
 net/core/Makefile                            |   2 +-
 net/ethtool/Makefile                         |   7 +
 net/ethtool/bitset.c                         | 572 +++++++++++++++++
 net/ethtool/bitset.h                         |  40 ++
 net/ethtool/common.c                         | 227 +++++++
 net/ethtool/common.h                         |  29 +
 net/ethtool/info.c                           | 332 ++++++++++
 net/{core/ethtool.c => ethtool/ioctl.c}      | 244 ++-----
 net/ethtool/netlink.c                        | 634 +++++++++++++++++++
 net/ethtool/netlink.h                        | 252 ++++++++
 net/ethtool/settings.c                       | 559 ++++++++++++++++
 net/ethtool/strset.c                         | 461 ++++++++++++++
 21 files changed, 3937 insertions(+), 207 deletions(-)
 create mode 100644 Documentation/networking/ethtool-netlink.txt
 create mode 100644 include/linux/ethtool_netlink.h
 create mode 100644 include/uapi/linux/ethtool_netlink.h
 create mode 100644 net/ethtool/Makefile
 create mode 100644 net/ethtool/bitset.c
 create mode 100644 net/ethtool/bitset.h
 create mode 100644 net/ethtool/common.c
 create mode 100644 net/ethtool/common.h
 create mode 100644 net/ethtool/info.c
 rename net/{core/ethtool.c => ethtool/ioctl.c} (91%)
 create mode 100644 net/ethtool/netlink.c
 create mode 100644 net/ethtool/netlink.h
 create mode 100644 net/ethtool/settings.c
 create mode 100644 net/ethtool/strset.c

Comments

Jiri Pirko Feb. 19, 2019, 10:35 a.m. UTC | #1
Mon, Feb 18, 2019 at 07:21:24PM CET, mkubecek@suse.cz wrote:
>Note: this is marked as RFC because it's rather late in the cycle; the plan
>is to make a regular submission (with changes based on review) once
>net-next reopens after the 5.1 merge window. The full (work in progress)
>series, together with the (userspace) ethtool counterpart can be found at
>https://github.com/mkubecek/ethnl
>
>This is first part of alternative userspace interface for ethtool. The aim
>is to address some long known issues with the ioctl interface, mainly lack
>of extensibility, raciness, limited error reporting and absence of
>notifications.
>
>The interface uses generic netlink family "ethtool"; it provides multicast
>group "monitor" which is used for notifications. Documentation for the
>interface is in Documentation/networking/ethtool-netlink.txt
>
>Basic concepts:
>
>- the goal is to provide all features of ioctl interface but allow
>  easier future extensions; at some point, it should be possible to have
>  full ethtool functionality without using the ioctl interface

I'm not sure it is a good idea to map the existing iface to netlink
fully. There are things in ethtool which are not really unique for
Ethernet. Those things should be put somewhere else.


>- inextensibility of ioctl interface resulted in way too many commands,
>  many of them obsoleted by newer ones; reduce the number by  ignoring the
>  obsolete commands and grouping some together
>- for "set" type commands, allows passing only the attributes to be
>  changed; therefore we don't need a get-modify-set cycle (which is
>  inherently racy), userspace can simply say what it wants to change
>- provide notifications to multicast group "monitor" like rtnetlink does,
>  i.e. in the form of messages close to replies to "get" requests
>- allow dump requests to get some information about all network devices
>  providing it
>- be less dependent on ethtool and kernel being in sync; allow e.g. saying
>  "ethtool -s eth0 advertise foo off" without ethtool knowing what "foo"
>  means; it's kernel's job to know what mode "xyz" is and if it exists and
>  is supported
>
>Main changes between RFC v2 and RFC v3:
>
>- do not allow building as a module (no netdev notifiers needed)
>- drop some obsolete fields
>- add permanent hw address, timestamping and private flags support
>- rework bitset handling to get rid of variable length arrays
>- notify monitor on device renames
>- restructure GET_SETTINGS/SET_SETTINGS messages
>- split too long patches and submit only first part of the series
>
>Main changes between RFC v1 and RFC v2:
>
>- support dumps for all "get" requests
>- provide notifications for changes related to supported request types
>- support getting string sets (both global and per device)
>- support getting/setting device features
>- get rid of family specific header, everything passed as attributes
>- split netlink code into multiple files in net/ethtool/ directory
>
>ToDo / open questions:
>
>- some features provided by ethtool would rather belong to devlink (and
>  some are already superseded by devlink); however, only few drivers
>  provide devlink interface at the moment and as recent discussion on
>  flashing revealed, we cannot rely on devlink's presence

Could you explain why please?


>
>- while the netlink interface allows easy future extensions, ethtool_ops
>  interface does not; some settings could be implemented using tunables and
>  accessed via relevant netlink messages (as well as tunables) from
>  userspace but in the long term, something better will be needed
>
>- currently, all communication with drivers via ethtool_ops is done
>  under RTNL as this is what ioctl interface does and likely many
>  ethtool_ops handlers rely on that; if we are going to rework ethtool_ops
>  in the future ("phase two"), it would be nice to get rid of it
>
>- ethtool_ops should pass extack pointer to allow drivers more meaningful
>  error reporting; it's not clear, however, how to pass information about
>  offending attribute
>
>- notifications are sent whenever a change is done via netlink API or
>  ioctl API and for netdev features also whenever they are updated using
>  netdev_change_features(); it would be desirable to notify also about
>  link state and negotiation result (speed/duplex and partner link
>  modes) but it would be more tricky
>
>Michal Kubecek (21):
>  netlink: introduce nla_put_bitfield32()
>  ethtool: move to its own directory
>  ethtool: introduce ethtool netlink interface
>  ethtool: helper functions for netlink interface
>  ethtool: netlink bitset handling
>  ethtool: support for netlink notifications
>  ethtool: implement EVENT notifications
>  ethtool: generic handlers for GET requests
>  ethtool: move string arrays into common file
>  ethtool: provide string sets with GET_STRSET request
>  ethtool: provide driver/device information in GET_INFO request
>  ethtool: provide permanent hardware address in GET_INFO request
>  ethtool: provide timestamping information in GET_INFO request
>  ethtool: provide link mode names as a string set
>  ethtool: provide link settings and link modes in GET_SETTINGS request
>  ethtool: provide WoL information in GET_SETTINGS request
>  ethtool: provide message level in GET_SETTINGS request
>  ethtool: provide link state in GET_SETTINGS request
>  ethtool: provide device features in GET_SETTINGS request
>  ethtool: provide private flags in GET_SETTINGS request
>  ethtool: send netlink notifications about setting changes
>
> Documentation/networking/ethtool-netlink.txt | 441 +++++++++++++
> include/linux/ethtool_netlink.h              |  17 +
> include/linux/netdevice.h                    |  14 +
> include/net/netlink.h                        |  15 +
> include/uapi/linux/ethtool.h                 |  10 +
> include/uapi/linux/ethtool_netlink.h         | 265 ++++++++
> include/uapi/linux/net_tstamp.h              |  13 +
> net/Kconfig                                  |   8 +
> net/Makefile                                 |   2 +-
> net/core/Makefile                            |   2 +-
> net/ethtool/Makefile                         |   7 +
> net/ethtool/bitset.c                         | 572 +++++++++++++++++
> net/ethtool/bitset.h                         |  40 ++
> net/ethtool/common.c                         | 227 +++++++
> net/ethtool/common.h                         |  29 +
> net/ethtool/info.c                           | 332 ++++++++++
> net/{core/ethtool.c => ethtool/ioctl.c}      | 244 ++-----
> net/ethtool/netlink.c                        | 634 +++++++++++++++++++
> net/ethtool/netlink.h                        | 252 ++++++++
> net/ethtool/settings.c                       | 559 ++++++++++++++++
> net/ethtool/strset.c                         | 461 ++++++++++++++
> 21 files changed, 3937 insertions(+), 207 deletions(-)
> create mode 100644 Documentation/networking/ethtool-netlink.txt
> create mode 100644 include/linux/ethtool_netlink.h
> create mode 100644 include/uapi/linux/ethtool_netlink.h
> create mode 100644 net/ethtool/Makefile
> create mode 100644 net/ethtool/bitset.c
> create mode 100644 net/ethtool/bitset.h
> create mode 100644 net/ethtool/common.c
> create mode 100644 net/ethtool/common.h
> create mode 100644 net/ethtool/info.c
> rename net/{core/ethtool.c => ethtool/ioctl.c} (91%)
> create mode 100644 net/ethtool/netlink.c
> create mode 100644 net/ethtool/netlink.h
> create mode 100644 net/ethtool/settings.c
> create mode 100644 net/ethtool/strset.c
>
>-- 
>2.20.1
>
Michal Kubecek Feb. 19, 2019, 11:57 a.m. UTC | #2
On Tue, Feb 19, 2019 at 11:35:08AM +0100, Jiri Pirko wrote:
> >- some features provided by ethtool would rather belong to devlink (and
> >  some are already superseded by devlink); however, only few drivers
> >  provide devlink interface at the moment and as recent discussion on
> >  flashing revealed, we cannot rely on devlink's presence
> 
> Could you explain why please?

What I mean is the problem discussed under Jakub's devlink flash
patchset: that he couldn't implement only the devlink callback in nfp
and rely on the generic fallback to devlink because it wouldn't work if
devlink is built as a module.

But I think this should be addressed. If we agree that flashing (and
other features provided by ethtool at the moment) rather belongs to
devlink (which nobody seems to oppose), we should rather try to make it
possible for drivers to provide only the devlink callback and gradually
move all in-tree drivers to doing so. (And one day, remove it from
ethtool_ops.) It doesn't seem to make much sense to have devlink as
a module in such scenario.

Michal
Jiri Pirko Feb. 19, 2019, 12:27 p.m. UTC | #3
Tue, Feb 19, 2019 at 12:57:27PM CET, mkubecek@suse.cz wrote:
>On Tue, Feb 19, 2019 at 11:35:08AM +0100, Jiri Pirko wrote:
>> >- some features provided by ethtool would rather belong to devlink (and
>> >  some are already superseded by devlink); however, only few drivers
>> >  provide devlink interface at the moment and as recent discussion on
>> >  flashing revealed, we cannot rely on devlink's presence
>> 
>> Could you explain why please?
>
>What I mean is the problem discussed under Jakub's devlink flash
>patchset: that he couldn't implement only the devlink callback in nfp
>and rely on the generic fallback to devlink because it wouldn't work if
>devlink is built as a module.

So let's fix that.


>
>But I think this should be addressed. If we agree that flashing (and
>other features provided by ethtool at the moment) rather belongs to
>devlink (which nobody seems to oppose), we should rather try to make it
>possible for drivers to provide only the devlink callback and gradually
>move all in-tree drivers to doing so. (And one day, remove it from
>ethtool_ops.) It doesn't seem to make much sense to have devlink as
>a module in such scenario.

Agreed.

>
>Michal
Florian Fainelli Feb. 21, 2019, 3:21 a.m. UTC | #4
Hi Michal,

Let me start with a big thank you for tackling this humongous amount of
work!

On 2/18/2019 10:21 AM, Michal Kubecek wrote:
> Note: this is marked as RFC because it's rather late in the cycle; the plan
> is to make a regular submission (with changes based on review) once
> net-next reopens after the 5.1 merge window. The full (work in progress)
> series, together with the (userspace) ethtool counterpart can be found at
> https://github.com/mkubecek/ethnl
> 
> This is first part of alternative userspace interface for ethtool. The aim
> is to address some long known issues with the ioctl interface, mainly lack
> of extensibility, raciness, limited error reporting and absence of
> notifications.
> 
> The interface uses generic netlink family "ethtool"; it provides multicast
> group "monitor" which is used for notifications. Documentation for the
> interface is in Documentation/networking/ethtool-netlink.txt
> 
> Basic concepts:
> 
> - the goal is to provide all features of ioctl interface but allow
>   easier future extensions; at some point, it should be possible to have
>   full ethtool functionality without using the ioctl interface

+1

> - inextensibility of ioctl interface resulted in way too many commands,
>   many of them obsoleted by newer ones; reduce the number by  ignoring the
>   obsolete commands and grouping some together
> - for "set" type commands, allows passing only the attributes to be
>   changed; therefore we don't need a get-modify-set cycle (which is
>   inherently racy), userspace can simply say what it wants to change
> - provide notifications to multicast group "monitor" like rtnetlink does,
>   i.e. in the form of messages close to replies to "get" requests

+1

> - allow dump requests to get some information about all network devices
>   providing it

And it seems like you have also added some filtering capability with at
least the settings dump, right? That is also highly welcome.

> - be less dependent on ethtool and kernel being in sync; allow e.g. saying
>   "ethtool -s eth0 advertise foo off" without ethtool knowing what "foo"
>   means; it's kernel's job to know what mode "xyz" is and if it exists and
>   is supported
> 
> Main changes between RFC v2 and RFC v3:
> 
> - do not allow building as a module (no netdev notifiers needed)

I would very much prefer that we could build this as a module. Some
systems might be memory constrained or opt to disable ethtool entirely
(security?). If this is not too much trouble, can we try to maintain that?

> - drop some obsolete fields
> - add permanent hw address, timestamping and private flags support
> - rework bitset handling to get rid of variable length arrays
> - notify monitor on device renames
> - restructure GET_SETTINGS/SET_SETTINGS messages
> - split too long patches and submit only first part of the series
> 
> Main changes between RFC v1 and RFC v2:
> 
> - support dumps for all "get" requests
> - provide notifications for changes related to supported request types
> - support getting string sets (both global and per device)
> - support getting/setting device features
> - get rid of family specific header, everything passed as attributes
> - split netlink code into multiple files in net/ethtool/ directory
> 
> ToDo / open questions:
> 
> - some features provided by ethtool would rather belong to devlink (and
>   some are already superseded by devlink); however, only few drivers
>   provide devlink interface at the moment and as recent discussion on
>   flashing revealed, we cannot rely on devlink's presence

Should we just move everything under devlink given that ethtool over
netlink or devlink are essentially manipulating the same (or close to)
objects? It seems to me that at least now is the right time to make
decisions about what should stay in ethtool and be offered via netlink,
what should be migrated to something else (e.g.: Jiri mentioned
rtnetlink), and what is a devlink candidate.

How do we proceed to easily triage which of these netlink facilities
should things be routed to/from?

> 
> - while the netlink interface allows easy future extensions, ethtool_ops
>   interface does not; some settings could be implemented using tunables and
>   accessed via relevant netlink messages (as well as tunables) from
>   userspace but in the long term, something better will be needed

Right, so as a driver writer, one thing I kind of hate is having to fill
a netlink skb myself because that's a lot of work, and it makes it
fairly hard to do mass conversion of internal APIs due to the increased
complexity in auditing what drivers do.

> 
> - currently, all communication with drivers via ethtool_ops is done
>   under RTNL as this is what ioctl interface does and likely many
>   ethtool_ops handlers rely on that; if we are going to rework ethtool_ops
>   in the future ("phase two"), it would be nice to get rid of it
> 
> - ethtool_ops should pass extack pointer to allow drivers more meaningful
>   error reporting; it's not clear, however, how to pass information about
>   offending attribute
> 
> - notifications are sent whenever a change is done via netlink API or
>   ioctl API and for netdev features also whenever they are updated using
>   netdev_change_features(); it would be desirable to notify also about
>   link state and negotiation result (speed/duplex and partner link
>   modes) but it would be more tricky

Historically you listen for a link event and you determine the link
parameters from there, I don't know if there is much value in giving the
full link parameter list, especially given those could theoretically
change depending on your medium (e.g.: wireless), or there could be a
lot to query at that time...

> 
> Michal Kubecek (21):
>   netlink: introduce nla_put_bitfield32()
>   ethtool: move to its own directory
>   ethtool: introduce ethtool netlink interface
>   ethtool: helper functions for netlink interface
>   ethtool: netlink bitset handling
>   ethtool: support for netlink notifications
>   ethtool: implement EVENT notifications
>   ethtool: generic handlers for GET requests
>   ethtool: move string arrays into common file
>   ethtool: provide string sets with GET_STRSET request
>   ethtool: provide driver/device information in GET_INFO request
>   ethtool: provide permanent hardware address in GET_INFO request
>   ethtool: provide timestamping information in GET_INFO request
>   ethtool: provide link mode names as a string set
>   ethtool: provide link settings and link modes in GET_SETTINGS request
>   ethtool: provide WoL information in GET_SETTINGS request
>   ethtool: provide message level in GET_SETTINGS request
>   ethtool: provide link state in GET_SETTINGS request
>   ethtool: provide device features in GET_SETTINGS request
>   ethtool: provide private flags in GET_SETTINGS request
>   ethtool: send netlink notifications about setting changes
> 
>  Documentation/networking/ethtool-netlink.txt | 441 +++++++++++++
>  include/linux/ethtool_netlink.h              |  17 +
>  include/linux/netdevice.h                    |  14 +
>  include/net/netlink.h                        |  15 +
>  include/uapi/linux/ethtool.h                 |  10 +
>  include/uapi/linux/ethtool_netlink.h         | 265 ++++++++
>  include/uapi/linux/net_tstamp.h              |  13 +
>  net/Kconfig                                  |   8 +
>  net/Makefile                                 |   2 +-
>  net/core/Makefile                            |   2 +-
>  net/ethtool/Makefile                         |   7 +
>  net/ethtool/bitset.c                         | 572 +++++++++++++++++
>  net/ethtool/bitset.h                         |  40 ++
>  net/ethtool/common.c                         | 227 +++++++
>  net/ethtool/common.h                         |  29 +
>  net/ethtool/info.c                           | 332 ++++++++++
>  net/{core/ethtool.c => ethtool/ioctl.c}      | 244 ++-----
>  net/ethtool/netlink.c                        | 634 +++++++++++++++++++
>  net/ethtool/netlink.h                        | 252 ++++++++
>  net/ethtool/settings.c                       | 559 ++++++++++++++++
>  net/ethtool/strset.c                         | 461 ++++++++++++++
>  21 files changed, 3937 insertions(+), 207 deletions(-)
>  create mode 100644 Documentation/networking/ethtool-netlink.txt
>  create mode 100644 include/linux/ethtool_netlink.h
>  create mode 100644 include/uapi/linux/ethtool_netlink.h
>  create mode 100644 net/ethtool/Makefile
>  create mode 100644 net/ethtool/bitset.c
>  create mode 100644 net/ethtool/bitset.h
>  create mode 100644 net/ethtool/common.c
>  create mode 100644 net/ethtool/common.h
>  create mode 100644 net/ethtool/info.c
>  rename net/{core/ethtool.c => ethtool/ioctl.c} (91%)
>  create mode 100644 net/ethtool/netlink.c
>  create mode 100644 net/ethtool/netlink.h
>  create mode 100644 net/ethtool/settings.c
>  create mode 100644 net/ethtool/strset.c
>
Michal Kubecek Feb. 21, 2019, 9:54 a.m. UTC | #5
On Wed, Feb 20, 2019 at 07:21:18PM -0800, Florian Fainelli wrote:
> > Main changes between RFC v2 and RFC v3:
> > 
> > - do not allow building as a module (no netdev notifiers needed)
> 
> I would very much prefer that we could build this as a module. Some
> systems might be memory constrained or opt to disable ethtool entirely
> (security?). If this is not too much trouble, can we try to maintain that?

There were two reasons why I agreed so quickly to the suggestion to make
CONFIG_ETHTOOL a bool:

1. I didn't really think there would be anyone who would benefit from
having the interface as a loadable module and the only reason why I made
it possible originally was that it was easier to test changes that way.
That was rather poor reason for keeping it once submitted but if there
is an actual demand from real life users, it's something different.

2. Using of ethtool netdev notifiers for triggering netlink
notifications from other code (mostly ioctl interface) was rather
clumsy. However, it could be worked around using an RCU pointer and some
simple helpers (as discussed in the devlink thread) so this not not
a serious problem either.

> > - some features provided by ethtool would rather belong to devlink (and
> >   some are already superseded by devlink); however, only few drivers
> >   provide devlink interface at the moment and as recent discussion on
> >   flashing revealed, we cannot rely on devlink's presence
> 
> Should we just move everything under devlink given that ethtool over
> netlink or devlink are essentially manipulating the same (or close to)
> objects? It seems to me that at least now is the right time to make
> decisions about what should stay in ethtool and be offered via netlink,
> what should be migrated to something else (e.g.: Jiri mentioned
> rtnetlink), and what is a devlink candidate.
> 
> How do we proceed to easily triage which of these netlink facilities
> should things be routed to/from?

At the moment I would say the likely candidates for rtnetlink are
permanent hw address (ethtool -P), netdev features (-k / -K) and link
state (part of "ethtool <dev>"). (I'm not very happy about the prospect
of moving netdev features to rtnetlink but it seems hard to argue why
they shouldn't belong there.) As for devlink, I would see eeprom dump
(-e) and write (-E), register dump (-d), firmware dump (-w / -W) and
flash (-f), and module eeprom dump (-m) as candidates. Maybe also device
reset (--reset) (which was put into ethtool_ops, though, even if added
only recently).

There is also another question: at which level to do the split.

I would like to avoid doint it between user and ethtool utility, i.e.
dropping some functions from ethtool and pointing people to iproute2
tools (or even replacing ethtool with some new tool). There is the
"ifconfig memento": 20 years after it became obsolete, I still keep
hearing people say "I don't want to learn some new tool with unfamiliar
syntax". (Some of them learning to use ifconfig few years ago doesn't
really help. Neither them saying the same about ss which copied the most
of netstat's basic options.) I'm afraid the result would be exactly the
same with ethtool.

Then there is an option of providing the functions in ethtool but using
both ethtool netlink and devlink (and probably also rtnetlink) to talk
to kernel. This means a bit more complicated userspace code but nothing
serious. The bigger problem I see with this approach is that so far only
few NIC drivers have any devlink support. Unless we would want to add
ethtool_ops fallback for devlink, we would have to wait until the
relevant features are available via devlink in all in-tree NIC drivers
which provide them via ethtool_ops now.

Providing the features via ethtool netlink interface but having them
backed up by devlink would mean we would have two userspace interfaces
for the same (or almost the same) function. It's something I could live
with but I understand that not everyone would like this. The advantage
is that it would allow migrating the features from ethtool_ops to
devlink gradually.

> > - while the netlink interface allows easy future extensions, ethtool_ops
> >   interface does not; some settings could be implemented using tunables and
> >   accessed via relevant netlink messages (as well as tunables) from
> >   userspace but in the long term, something better will be needed
> 
> Right, so as a driver writer, one thing I kind of hate is having to fill
> a netlink skb myself because that's a lot of work, and it makes it
> fairly hard to do mass conversion of internal APIs due to the increased
> complexity in auditing what drivers do.

I would also prefer to avoid parsing and generating netlink messages in
drivers for regular operations.

Michal