[next-queue,RFC,0/4] ethtool: Add support for frame preemption

Message ID	20200516012948.3173993-1-vinicius.gomes@intel.com
Headers	show Return-Path: <intel-wired-lan-bounces@osuosl.org> IronPort-SDR: Uuf+0V6EK82EVWSkinrZVHMcneHh9uDN7Wj+wn93LkVRgjrcBDr0bvyeRPhGQ25y2ouua8VD9F rZx7/kFF3yFg== IronPort-SDR: c/1wXf/KexM6fjgEE6TwkFvVC4jFoJ/IJtxuNnt9d9ZaAAZKjpTyHuTRMebdkaI2OxJ3j101kr JEF3iufYlTnQ== From: Vinicius Costa Gomes <vinicius.gomes@intel.com> To: intel-wired-lan@lists.osuosl.org Date: Fri, 15 May 2020 18:29:44 -0700 Message-Id: <20200516012948.3173993-1-vinicius.gomes@intel.com> MIME-Version: 1.0 Subject: [Intel-wired-lan] [next-queue RFC 0/4] ethtool: Add support for frame preemption Precedence: list Cc: Jose.Abreu@synopsys.com, netdev@vger.kernel.org, po.liu@nxp.com, m-karicheri2@ti.com, vladimir.oltean@nxp.com Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: intel-wired-lan-bounces@osuosl.org Sender: "Intel-wired-lan" <intel-wired-lan-bounces@osuosl.org>
Series	ethtool: Add support for frame preemption \| expand [next-queue,RFC,0/4] ethtool: Add support for frame preemption [next-queue,RFC,1/4] ethtool: Add support for configuring frame preemption [next-queue,RFC,2/4] ethtool: Add support for configuring frame preemption via netlink [next-queue,RFC,3/4] igc: Add support for configuring frame preemption [next-queue,RFC,4/4] igc: Add support for exposing frame preemption stats registers

Vinicius Costa Gomes May 16, 2020, 1:29 a.m. UTC

Hi,

This series adds support for configuring frame preemption, as defined
by IEEE 802.1Q-2018 (previously IEEE 802.1Qbu) and IEEE 802.3br.

Frame preemption allows a packet from a higher priority queue marked
as "express" to preempt a packet from lower priority queue marked as
"preemptible". The idea is that this can help reduce the latency for
higher priority traffic.

Previously, the proposed interface for configuring these features was
using the qdisc layer. But as this is very hardware dependent and all
that qdisc did was pass the information to the driver, it makes sense
to have this in ethtool.

One example, for retrieving and setting the configuration:

$ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
Frame preemption settings for enp3s0:
	support: supported
	active: active
	supported queues: 0xf
	supported queues: 0xe
	minimum fragment size: 68


$ ethtool --set-frame-preemption enp3s0 fp on min-frag-size 68 preemptible-queues-mask 0xe

This is a RFC because I wanted to have feedback on some points:

  - The parameters added are enough for the hardware I have, is it
    enough in general?

  - even with the ethtool via netlink effort, I chose to keep the
    ioctl() way, in case someone wants to backport this to an older
    kernel, is there a problem with this?

  - Some space for bikeshedding the names and location (for example,
    does it make sense for these settings to be per-queue?), as I am
    not quite happy with them, one example, is the use of preemptible
    vs. preemptable;


About the patches, should be quite straightforward:

Patch 1, adds the ETHTOOL_GFP and ETHOOL_SFP commands and the
associated data structures;

Patch 2, adds the ETHTOOL_MSG_PREEMPT_GET and ETHTOOL_MSG_PREEMPT_SET
netlink messages and the associated attributes;

Patch 3, is the example implementation for the igc driver, the catch
here is that frame preemption in igc is dependent on the TSN "mode"
being enabled;

Patch 4, adds some registers that helped during implementation.

Another thing is that because igc is still under development, to avoid
conflicts, I think it might be easier for the PATCH version of this
series to go through Jeff Kirsher's tree.

Vinicius Costa Gomes (4):
  ethtool: Add support for configuring frame preemption
  ethtool: Add support for configuring frame preemption via netlink
  igc: Add support for configuring frame preemption
  igc: Add support for exposing frame preemption stats registers

 drivers/net/ethernet/intel/igc/igc.h         |   3 +
 drivers/net/ethernet/intel/igc/igc_defines.h |   6 +
 drivers/net/ethernet/intel/igc/igc_ethtool.c |  77 ++++++++
 drivers/net/ethernet/intel/igc/igc_regs.h    |  10 +
 drivers/net/ethernet/intel/igc/igc_tsn.c     |  46 ++++-
 include/linux/ethtool.h                      |   6 +
 include/uapi/linux/ethtool.h                 |  25 +++
 include/uapi/linux/ethtool_netlink.h         |  19 ++
 net/ethtool/Makefile                         |   3 +-
 net/ethtool/ioctl.c                          |  36 ++++
 net/ethtool/netlink.c                        |  15 ++
 net/ethtool/netlink.h                        |   2 +
 net/ethtool/preempt.c                        | 181 +++++++++++++++++++
 13 files changed, 423 insertions(+), 6 deletions(-)
 create mode 100644 net/ethtool/preempt.c

Michal Kubecek May 16, 2020, 9:33 a.m. UTC | #1

On Fri, May 15, 2020 at 06:29:44PM -0700, Vinicius Costa Gomes wrote:
> Hi,
> 
> This series adds support for configuring frame preemption, as defined
> by IEEE 802.1Q-2018 (previously IEEE 802.1Qbu) and IEEE 802.3br.
> 
> Frame preemption allows a packet from a higher priority queue marked
> as "express" to preempt a packet from lower priority queue marked as
> "preemptible". The idea is that this can help reduce the latency for
> higher priority traffic.
> 
> Previously, the proposed interface for configuring these features was
> using the qdisc layer. But as this is very hardware dependent and all
> that qdisc did was pass the information to the driver, it makes sense
> to have this in ethtool.
> 
> One example, for retrieving and setting the configuration:
> 
> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
> Frame preemption settings for enp3s0:
> 	support: supported

IMHO we don't need a special bool for this. IIUC this is not a state
flag that would change value for a particular device; either the device
supports the feature or it does not. If it does not, the ethtool_ops
callbacks would return -EOPNOTSUPP (or would not even exist if the
driver has no support) and ethtool would say so.

> 	active: active
> 	supported queues: 0xf
> 	supported queues: 0xe
> 	minimum fragment size: 68
> 
> 
> $ ethtool --set-frame-preemption enp3s0 fp on min-frag-size 68 preemptible-queues-mask 0xe
> 
> This is a RFC because I wanted to have feedback on some points:
> 
>   - The parameters added are enough for the hardware I have, is it
>     enough in general?
> 
>   - even with the ethtool via netlink effort, I chose to keep the
>     ioctl() way, in case someone wants to backport this to an older
>     kernel, is there a problem with this?

I would prefer not extending ioctl interface with new features, with
obvious exceptions like adding new link modes or so. Not only because
having new features only available through netlink will motivate authors
of userspace tools to support netlink but mostly because the lack of
flexibility and extensibility of ioctl interface inevitably leads to
compromises you wouldn't have to do if you only implement netlink
requests.

One example I can see is the use of u32 for queue bitmaps. Perhaps you
don't expect this feature to be supported on devices with more than 32
queues (and I don't have enough expertise to tell if it's justified at
the moment) but can you be sure it will be the case in 10 or 20 years?
As long as these hardcoded u32 bitmaps are only part of internal kernel
API (ethtool_ops), extending the support for bigger devices will mean
some code churn (possibly large if many drivers implement the feature)
but it's something that can be done. But if you have this limit in
userspace API, you are in a much bigger trouble. The same can be said
for adding new attributes - easy with netlink but with ioctl you never
know if those reserved fields will suffice.

> 
>   - Some space for bikeshedding the names and location (for example,
>     does it make sense for these settings to be per-queue?), as I am
>     not quite happy with them, one example, is the use of preemptible
>     vs. preemptable;
> 
> 
> About the patches, should be quite straightforward:
> 
> Patch 1, adds the ETHTOOL_GFP and ETHOOL_SFP commands and the
> associated data structures;
> 
> Patch 2, adds the ETHTOOL_MSG_PREEMPT_GET and ETHTOOL_MSG_PREEMPT_SET
> netlink messages and the associated attributes;

I didn't look too deeply but one thing I noticed is that setting the
parameters using ioctl() does not trigger netlink notification. If we
decide to implement ioctl support (and I'm not a fan of that), the
notifications should be sent even when ioctl is used.

Michal

David Miller May 16, 2020, 8:37 p.m. UTC | #2

From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Fri, 15 May 2020 18:29:44 -0700

> This series adds support for configuring frame preemption, as defined
> by IEEE 802.1Q-2018 (previously IEEE 802.1Qbu) and IEEE 802.3br.
> 
> Frame preemption allows a packet from a higher priority queue marked
> as "express" to preempt a packet from lower priority queue marked as
> "preemptible". The idea is that this can help reduce the latency for
> higher priority traffic.

Why do we need yet another name for something which is just basic
traffic prioritization and why is this configured via ethtool instead
of the "traffic classifier" which is where all of this stuff should
be done?

Vladimir Oltean May 16, 2020, 9:03 p.m. UTC | #3

Hi David,

On Sat, 16 May 2020 at 23:39, David Miller <davem@davemloft.net> wrote:
>
> From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> Date: Fri, 15 May 2020 18:29:44 -0700
>
> > This series adds support for configuring frame preemption, as defined
> > by IEEE 802.1Q-2018 (previously IEEE 802.1Qbu) and IEEE 802.3br.
> >
> > Frame preemption allows a packet from a higher priority queue marked
> > as "express" to preempt a packet from lower priority queue marked as
> > "preemptible". The idea is that this can help reduce the latency for
> > higher priority traffic.
>
> Why do we need yet another name for something which is just basic
> traffic prioritization and why is this configured via ethtool instead
> of the "traffic classifier" which is where all of this stuff should
> be done?

It is not 'just another name for basic traffic prioritization'. With
basic traffic prioritization only, a high-priority packet still has to
wait in the egress queue of a switch until a (potentially large)
low-priority packet has finished transmission and has freed the
medium. Frame preemption changes that. Actually it requires hardware
support on both ends, because the way it is transmitted on the wire is
not compatible with regular Ethernet frames (it uses a special Start
Of Frame Delimiter to encode preemptible traffic).
I know we are talking about ridiculously low improvements in latency,
but the background is that Ethernet is making its way into the
industrial and process control fields, and for that type of
application you need to ensure minimally low and maximally consistent
end-to-end latencies. Frame preemption helps with the "minimally low"
part. The way it works is that typically there are 2 MACs per
interface (1 is "express" - equivalent to the legacy type, and the
other is "preemptible" - the new type) and this new IEEE 802.1Q clause
thing allows some arbitration/preemption event to happen between the
two MACs. When a preemption event happens, the preemptible MAC quickly
wraps up and aborts the frame it's currently transmitting (to come
back and continue later), making room for the express MAC to do its
thing because it's time-constrained. Then, after the express MAC
finishes, the preemptible MAC continues with the rest of the frame
fragments from where it was preempted.
As to why this doesn't go to tc but to ethtool: why would it go to tc?
You can't emulate such behavior in software. It's a hardware feature.
You only* (more or less) need to specify which traffic classes on a
port go to the preemptible MAC and which go to the express MAC. We
discussed about the possibility of extending tc-taprio to configure
frame preemption through it, but the consensus was that somebody might
want to use frame preemption as a standalone feature, without
scheduled traffic, and that inventing another qdisc for frame
preemption alone would be too much of a formalism. (I hope I didn't
omit anything important from the previous discussion on the topic)

Thanks,
-Vladimir

David Miller May 16, 2020, 10:19 p.m. UTC | #4

From: Vladimir Oltean <olteanv@gmail.com>
Date: Sun, 17 May 2020 00:03:39 +0300

> As to why this doesn't go to tc but to ethtool: why would it go to tc?

Maybe you can't %100 duplicate the on-the-wire special format and
whatever, but the queueing behavior ABSOLUTELY you can emulate in
software.

And then you have the proper hooks added for HW offload which can
do the on-the-wire stuff.

That's how we do these things, not with bolted on ethtool stuff.

Vladimir Oltean May 17, 2020, 10:51 a.m. UTC | #5

On Sun, 17 May 2020 at 01:19, David Miller <davem@davemloft.net> wrote:
>
> From: Vladimir Oltean <olteanv@gmail.com>
> Date: Sun, 17 May 2020 00:03:39 +0300
>
> > As to why this doesn't go to tc but to ethtool: why would it go to tc?
>
> Maybe you can't %100 duplicate the on-the-wire special format and
> whatever, but the queueing behavior ABSOLUTELY you can emulate in
> software.
>
> And then you have the proper hooks added for HW offload which can
> do the on-the-wire stuff.
>
> That's how we do these things, not with bolted on ethtool stuff.

When talking about frame preemption in the way that it is defined in
802.1Qbu and 802.3br, it says or assumes nothing about queuing. It
describes the fact that there are 2 MACs per interface, 1 MAC dealing
with some traffic classes and the other dealing with the rest. Queuing
sits completely above the layer where frame preemption happens:
- The queuing layer does not care if packets go to a traffic class
that is serviced by a preemptible MAC or an express MAC
- The MAC does not care by what means have packets been classified to
one traffic class or another.
I have no idea what emulation of queuing behavior you are talking
about. Frame preemption is a MAC hardware feature. Your NIC supports
it or it doesn't. Which means it can talk to a link partner that
supports frame preemption, or it can't.

Thanks,
-Vladimir

Andrew Lunn May 17, 2020, 6:45 p.m. UTC | #6

On Sun, May 17, 2020 at 01:51:19PM +0300, Vladimir Oltean wrote:
> On Sun, 17 May 2020 at 01:19, David Miller <davem@davemloft.net> wrote:
> >
> > From: Vladimir Oltean <olteanv@gmail.com>
> > Date: Sun, 17 May 2020 00:03:39 +0300
> >
> > > As to why this doesn't go to tc but to ethtool: why would it go to tc?
> >
> > Maybe you can't %100 duplicate the on-the-wire special format and
> > whatever, but the queueing behavior ABSOLUTELY you can emulate in
> > software.
> >
> > And then you have the proper hooks added for HW offload which can
> > do the on-the-wire stuff.
> >
> > That's how we do these things, not with bolted on ethtool stuff.
> 
> When talking about frame preemption in the way that it is defined in
> 802.1Qbu and 802.3br, it says or assumes nothing about queuing. It
> describes the fact that there are 2 MACs per interface, 1 MAC dealing
> with some traffic classes and the other dealing with the rest.

I did not follow the previous discussion, but i assume you talked
about modelling it in Linux as two MACs? Why was that approach not
followed?

   Andrew

Vladimir Oltean May 17, 2020, 7:04 p.m. UTC | #7

Hi Andrew,

On Sun, 17 May 2020 at 21:45, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sun, May 17, 2020 at 01:51:19PM +0300, Vladimir Oltean wrote:
> > On Sun, 17 May 2020 at 01:19, David Miller <davem@davemloft.net> wrote:
> > >
> > > From: Vladimir Oltean <olteanv@gmail.com>
> > > Date: Sun, 17 May 2020 00:03:39 +0300
> > >
> > > > As to why this doesn't go to tc but to ethtool: why would it go to tc?
> > >
> > > Maybe you can't %100 duplicate the on-the-wire special format and
> > > whatever, but the queueing behavior ABSOLUTELY you can emulate in
> > > software.
> > >
> > > And then you have the proper hooks added for HW offload which can
> > > do the on-the-wire stuff.
> > >
> > > That's how we do these things, not with bolted on ethtool stuff.
> >
> > When talking about frame preemption in the way that it is defined in
> > 802.1Qbu and 802.3br, it says or assumes nothing about queuing. It
> > describes the fact that there are 2 MACs per interface, 1 MAC dealing
> > with some traffic classes and the other dealing with the rest.
>
> I did not follow the previous discussion, but i assume you talked
> about modelling it in Linux as two MACs? Why was that approach not
> followed?
>
>    Andrew

I don't recall having discussed that option, but I guess that you can
see why, if I quote this portion from IEEE 802.1Q-2018:

6.7.1 Support of the ISS by IEEE Std 802.3 (Ethernet)
Mapping between M_CONTROL.requests/indications and IEEE 802.3
MA_CONTROL.requests/indications is performed as specified in IEEE Std
802.1AC. If the MAC supports IEEE 802.3br (TM) Interspersing Express
Traffic, then PFC M_CONTROL.requests are mapped onto the MAC control
interface associated with the express MAC (eMAC).
If frame preemption (6.7.2) is supported on a Port, then the IEEE
802.3 MAC provides the following two MAC service interfaces (99.4 of
IEEE Std 802.3br (TM)-2016 [B21]):
a) A preemptible MAC (pMAC) service interface
b) An express MAC (eMAC) service interface
For priority values that are identified in the frame preemption status
table (6.7.2) as preemptible, frames that are selected for
transmission shall be transmitted using the pMAC service instance, and
for priority values that are identified in the frame preemption status
table as express, frames that are selected for transmission shall be
transmitted using the eMAC service instance.

_In all other respects, the Port behaves as if it is supported by a
single MAC service interface_ (emphasis mine). In particular, all
frames received by the Port are treated as if they were received on a
single MAC service interface regardless of whether they were received
on the eMAC service interface or the pMAC service interface, except
with respect to frame preemption.

Thanks,
-Vladimir

Vinicius Costa Gomes May 18, 2020, 7:05 p.m. UTC | #8

Hi,

David Miller <davem@davemloft.net> writes:

> From: Vladimir Oltean <olteanv@gmail.com>
> Date: Sun, 17 May 2020 00:03:39 +0300
>
>> As to why this doesn't go to tc but to ethtool: why would it go to tc?
>
> Maybe you can't %100 duplicate the on-the-wire special format and
> whatever, but the queueing behavior ABSOLUTELY you can emulate in
> software.

Just saying what Vladimir said in different words: the queueing behavior
is already implemented in software, by mqprio or taprio, for example.

That is to say, if we add frame preemption support to those qdiscs all
they will do is pass the information to the driver, and that's it. They
won't be able to use that information at all.

The mental model I have for this feature is that is more similar to the
segmentation offloads, energy efficient ethernet or auto-negotiation
than it is to a traffic shaper like CBS.

>
> And then you have the proper hooks added for HW offload which can
> do the on-the-wire stuff.
>
> That's how we do these things, not with bolted on ethtool stuff.

Cheers,

Vinicius Costa Gomes May 18, 2020, 7:34 p.m. UTC | #9

Hi,

Michal Kubecek <mkubecek@suse.cz> writes:

> On Fri, May 15, 2020 at 06:29:44PM -0700, Vinicius Costa Gomes wrote:
>> Hi,
>> 
>> This series adds support for configuring frame preemption, as defined
>> by IEEE 802.1Q-2018 (previously IEEE 802.1Qbu) and IEEE 802.3br.
>> 
>> Frame preemption allows a packet from a higher priority queue marked
>> as "express" to preempt a packet from lower priority queue marked as
>> "preemptible". The idea is that this can help reduce the latency for
>> higher priority traffic.
>> 
>> Previously, the proposed interface for configuring these features was
>> using the qdisc layer. But as this is very hardware dependent and all
>> that qdisc did was pass the information to the driver, it makes sense
>> to have this in ethtool.
>> 
>> One example, for retrieving and setting the configuration:
>> 
>> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
>> Frame preemption settings for enp3s0:
>> 	support: supported
>
> IMHO we don't need a special bool for this. IIUC this is not a state
> flag that would change value for a particular device; either the device
> supports the feature or it does not. If it does not, the ethtool_ops
> callbacks would return -EOPNOTSUPP (or would not even exist if the
> driver has no support) and ethtool would say so.

(I know that the comments below only apply if "ethtool-way" is what's
decided)

Cool. Will remove the supported bit.

>
>> 	active: active
>> 	supported queues: 0xf
>> 	supported queues: 0xe
>> 	minimum fragment size: 68
>> 
>> 
>> $ ethtool --set-frame-preemption enp3s0 fp on min-frag-size 68 preemptible-queues-mask 0xe
>> 
>> This is a RFC because I wanted to have feedback on some points:
>> 
>>   - The parameters added are enough for the hardware I have, is it
>>     enough in general?
>> 
>>   - even with the ethtool via netlink effort, I chose to keep the
>>     ioctl() way, in case someone wants to backport this to an older
>>     kernel, is there a problem with this?
>
> I would prefer not extending ioctl interface with new features, with
> obvious exceptions like adding new link modes or so. Not only because
> having new features only available through netlink will motivate authors
> of userspace tools to support netlink but mostly because the lack of
> flexibility and extensibility of ioctl interface inevitably leads to
> compromises you wouldn't have to do if you only implement netlink
> requests.

Agreed. Will send the next version with only the netlink interface, and
let's see who complains.

>
> One example I can see is the use of u32 for queue bitmaps. Perhaps you
> don't expect this feature to be supported on devices with more than 32
> queues (and I don't have enough expertise to tell if it's justified at
> the moment) but can you be sure it will be the case in 10 or 20 years?
> As long as these hardcoded u32 bitmaps are only part of internal kernel
> API (ethtool_ops), extending the support for bigger devices will mean
> some code churn (possibly large if many drivers implement the feature)
> but it's something that can be done. But if you have this limit in
> userspace API, you are in a much bigger trouble. The same can be said
> for adding new attributes - easy with netlink but with ioctl you never
> know if those reserved fields will suffice.

A bit of background for this decision (using u32), frame preemption has
dimishing returns in relation to link speeds, my gut feeling is that for
links faster than 2.5G it stops making sense, at least in Linux, the
measurement noise will hide any latency improvement brought by frame
preemption. And I don't see many 2.5G NICs supporting more than 32
queues.

But I agree that keeping the interface future proof is better. Will
change to expose the queues configuration as bitset.

>
>> 
>>   - Some space for bikeshedding the names and location (for example,
>>     does it make sense for these settings to be per-queue?), as I am
>>     not quite happy with them, one example, is the use of preemptible
>>     vs. preemptable;
>> 
>> 
>> About the patches, should be quite straightforward:
>> 
>> Patch 1, adds the ETHTOOL_GFP and ETHOOL_SFP commands and the
>> associated data structures;
>> 
>> Patch 2, adds the ETHTOOL_MSG_PREEMPT_GET and ETHTOOL_MSG_PREEMPT_SET
>> netlink messages and the associated attributes;
>
> I didn't look too deeply but one thing I noticed is that setting the
> parameters using ioctl() does not trigger netlink notification. If we
> decide to implement ioctl support (and I'm not a fan of that), the
> notifications should be sent even when ioctl is used.

Oh, yeah, that's right. Nice catch.


Cheers,

Jakub Kicinski May 18, 2020, 8:56 p.m. UTC | #10

On Mon, 18 May 2020 12:05:04 -0700 Vinicius Costa Gomes wrote:
> David Miller <davem@davemloft.net> writes:
> > From: Vladimir Oltean <olteanv@gmail.com>
> > Date: Sun, 17 May 2020 00:03:39 +0300
> >  
> >> As to why this doesn't go to tc but to ethtool: why would it go to tc?  
> >
> > Maybe you can't %100 duplicate the on-the-wire special format and
> > whatever, but the queueing behavior ABSOLUTELY you can emulate in
> > software.  
> 
> Just saying what Vladimir said in different words: the queueing behavior
> is already implemented in software, by mqprio or taprio, for example.
> 
> That is to say, if we add frame preemption support to those qdiscs all
> they will do is pass the information to the driver, and that's it. They
> won't be able to use that information at all.
> 
> The mental model I have for this feature is that is more similar to the
> segmentation offloads, energy efficient ethernet or auto-negotiation
> than it is to a traffic shaper like CBS.

Please take a look at the example from the cover letter:

$ ethtool $ sudo ./ethtool --show-frame-preemption
enp3s0 Frame preemption settings for enp3s0:
	support: supported
	active: active
	supported queues: 0xf
	supported queues: 0xe
	minimum fragment size: 68

Reading this I have no idea what 0xe is. I have to go and query TC API
to see what priorities and queues that will be. Which IMHO is a strong
argument that this information belongs there in the first place.

Vinicius Costa Gomes May 18, 2020, 10:06 p.m. UTC | #11

Hi,

Jakub Kicinski <kuba@kernel.org> writes:
>
> Please take a look at the example from the cover letter:
>
> $ ethtool $ sudo ./ethtool --show-frame-preemption
> enp3s0 Frame preemption settings for enp3s0:
> 	support: supported
> 	active: active
> 	supported queues: 0xf
> 	supported queues: 0xe
> 	minimum fragment size: 68
>
> Reading this I have no idea what 0xe is. I have to go and query TC API
> to see what priorities and queues that will be. Which IMHO is a strong
> argument that this information belongs there in the first place.

That was the (only?) strong argument in favor of having frame preemption
in the TC side when this was last discussed.

We can have a hybrid solution, we can move the express/preemptible per
queue map to mqprio/taprio/whatever. And have the more specific
configuration knobs, minimum fragment size, etc, in ethtool.

What do you think?

Cheers,

Jakub Kicinski May 18, 2020, 10:22 p.m. UTC | #12

On Mon, 18 May 2020 15:06:26 -0700 Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> >
> > Please take a look at the example from the cover letter:
> >
> > $ ethtool $ sudo ./ethtool --show-frame-preemption
> > enp3s0 Frame preemption settings for enp3s0:
> > 	support: supported
> > 	active: active
> > 	supported queues: 0xf
> > 	supported queues: 0xe
> > 	minimum fragment size: 68
> >
> > Reading this I have no idea what 0xe is. I have to go and query TC API
> > to see what priorities and queues that will be. Which IMHO is a strong
> > argument that this information belongs there in the first place.  
> 
> That was the (only?) strong argument in favor of having frame preemption
> in the TC side when this was last discussed.
> 
> We can have a hybrid solution, we can move the express/preemptible per
> queue map to mqprio/taprio/whatever. And have the more specific
> configuration knobs, minimum fragment size, etc, in ethtool.
> 
> What do you think?

Does the standard specify minimum fragment size as a global MAC setting?

Vinicius Costa Gomes May 18, 2020, 11:05 p.m. UTC | #13

Jakub Kicinski <kuba@kernel.org> writes:

>> That was the (only?) strong argument in favor of having frame preemption
>> in the TC side when this was last discussed.
>> 
>> We can have a hybrid solution, we can move the express/preemptible per
>> queue map to mqprio/taprio/whatever. And have the more specific
>> configuration knobs, minimum fragment size, etc, in ethtool.
>> 
>> What do you think?
>
> Does the standard specify minimum fragment size as a global MAC setting?

Yes, it's a per-MAC setting, not per-queue.

Jakub Kicinski May 18, 2020, 11:09 p.m. UTC | #14

On Mon, 18 May 2020 16:05:08 -0700 Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> >> That was the (only?) strong argument in favor of having frame preemption
> >> in the TC side when this was last discussed.
> >> 
> >> We can have a hybrid solution, we can move the express/preemptible per
> >> queue map to mqprio/taprio/whatever. And have the more specific
> >> configuration knobs, minimum fragment size, etc, in ethtool.
> >> 
> >> What do you think?  
> >
> > Does the standard specify minimum fragment size as a global MAC setting?  
> 
> Yes, it's a per-MAC setting, not per-queue. 

If standard defines it as per-MAC and we can reasonably expect vendors
won't try to "add value" and make it per queue (unlikely here AFAIU),
then for this part ethtool configuration seems okay to me.

Murali Karicheri May 19, 2020, 2:53 p.m. UTC | #15

Hi Vinicius,

On 5/15/20 9:29 PM, Vinicius Costa Gomes wrote:
> Hi,
> 
> This series adds support for configuring frame preemption, as defined
> by IEEE 802.1Q-2018 (previously IEEE 802.1Qbu) and IEEE 802.3br.
> 
> Frame preemption allows a packet from a higher priority queue marked
> as "express" to preempt a packet from lower priority queue marked as
> "preemptible". The idea is that this can help reduce the latency for
> higher priority traffic.
> 
> Previously, the proposed interface for configuring these features was
> using the qdisc layer. But as this is very hardware dependent and all
> that qdisc did was pass the information to the driver, it makes sense
> to have this in ethtool.
> 
> One example, for retrieving and setting the configuration:
> 
> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
> Frame preemption settings for enp3s0:
> 	support: supported
> 	active: active
> 	supported queues: 0xf

I assume this is will be in sync with ethtool -L output which indicates
how many tx h/w queues present? I mean if there are 8 h/w queues,
supported queues will show 0xff.

> 	supported queues: 0xe
 From the command below, it appears this is the preemptible queue mask.
bit 0  is Q0, bit 1 Q1 and so forth. Right? In that case isn't it more
clear to display
         preemptible queues : 0xef

In the above Q7 is express queue and Q6-Q0 are preemptible.

Also there is a handshake called verify that happens which initiated
by the h/w to check the capability of peer. It looks like
not all vendor's hardware supports it and good to have it displayed
something like

         Verify supported/{not supported}

If Verify is supported, FPE is enabled only if it succeeds. So may be
good to show a status of Verify if it is supported something like
         Verify success/Failed

> 	minimum fragment size: 68
> 
> 
> $ ethtool --set-frame-preemption enp3s0 fp on min-frag-size 68 preemptible-queues-mask 0xe
> 
> This is a RFC because I wanted to have feedback on some points:
> 
>    - The parameters added are enough for the hardware I have, is it
>      enough in general?

As described above, it would be good to add an optional parameter for
verify

ethtool --set-frame-preemption enp3s0 fp on min-frag-size 68 
preemptible-queues-mask 0xe verify on

> 
>    - even with the ethtool via netlink effort, I chose to keep the
>      ioctl() way, in case someone wants to backport this to an older
>      kernel, is there a problem with this?
> 
>    - Some space for bikeshedding the names and location (for example,
>      does it make sense for these settings to be per-queue?), as I am
>      not quite happy with them, one example, is the use of preemptible
>      vs. preemptable;
> 
> 
> About the patches, should be quite straightforward:
> 
> Patch 1, adds the ETHTOOL_GFP and ETHOOL_SFP commands and the
> associated data structures;
> 
> Patch 2, adds the ETHTOOL_MSG_PREEMPT_GET and ETHTOOL_MSG_PREEMPT_SET
> netlink messages and the associated attributes;
> 
> Patch 3, is the example implementation for the igc driver, the catch
> here is that frame preemption in igc is dependent on the TSN "mode"
> being enabled;
> 
> Patch 4, adds some registers that helped during implementation.
> 
> Another thing is that because igc is still under development, to avoid
> conflicts, I think it might be easier for the PATCH version of this
> series to go through Jeff Kirsher's tree.
> 
> Vinicius Costa Gomes (4):
>    ethtool: Add support for configuring frame preemption
>    ethtool: Add support for configuring frame preemption via netlink
>    igc: Add support for configuring frame preemption
>    igc: Add support for exposing frame preemption stats registers
> 
>   drivers/net/ethernet/intel/igc/igc.h         |   3 +
>   drivers/net/ethernet/intel/igc/igc_defines.h |   6 +
>   drivers/net/ethernet/intel/igc/igc_ethtool.c |  77 ++++++++
>   drivers/net/ethernet/intel/igc/igc_regs.h    |  10 +
>   drivers/net/ethernet/intel/igc/igc_tsn.c     |  46 ++++-
>   include/linux/ethtool.h                      |   6 +
>   include/uapi/linux/ethtool.h                 |  25 +++
>   include/uapi/linux/ethtool_netlink.h         |  19 ++
>   net/ethtool/Makefile                         |   3 +-
>   net/ethtool/ioctl.c                          |  36 ++++
>   net/ethtool/netlink.c                        |  15 ++
>   net/ethtool/netlink.h                        |   2 +
>   net/ethtool/preempt.c                        | 181 +++++++++++++++++++
>   13 files changed, 423 insertions(+), 6 deletions(-)
>   create mode 100644 net/ethtool/preempt.c
>

Vinicius Costa Gomes May 19, 2020, 3:32 p.m. UTC | #16

Murali Karicheri <m-karicheri2@ti.com> writes:

>> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
>> Frame preemption settings for enp3s0:
>> 	support: supported
>> 	active: active
>> 	supported queues: 0xf
>
> I assume this is will be in sync with ethtool -L output which indicates
> how many tx h/w queues present? I mean if there are 8 h/w queues,
> supported queues will show 0xff.

In this approach, the driver builds these bitmasks, so it's responsible
to keep it consistent with the rest of the stuff that's exposed in
ethtool.

>
>> 	supported queues: 0xe
>  From the command below, it appears this is the preemptible queue mask.
> bit 0  is Q0, bit 1 Q1 and so forth. Right? In that case isn't it more
> clear to display
>          preemptible queues : 0xef
>
> In the above Q7 is express queue and Q6-Q0 are preemptible.

In my case, the controller I have here only has 4 queues, and Queue 0 is
the highest priority one, and it's marked as express.

>
> Also there is a handshake called verify that happens which initiated
> by the h/w to check the capability of peer. It looks like
> not all vendor's hardware supports it and good to have it displayed
> something like
>
>          Verify supported/{not supported}
>
> If Verify is supported, FPE is enabled only if it succeeds. So may be
> good to show a status of Verify if it is supported something like
>          Verify success/Failed
>
>> 	minimum fragment size: 68
>> 
>> 
>> $ ethtool --set-frame-preemption enp3s0 fp on min-frag-size 68 preemptible-queues-mask 0xe
>> 
>> This is a RFC because I wanted to have feedback on some points:
>> 
>>    - The parameters added are enough for the hardware I have, is it
>>      enough in general?
>
> As described above, it would be good to add an optional parameter for
> verify
>
> ethtool --set-frame-preemption enp3s0 fp on min-frag-size 68 
> preemptible-queues-mask 0xe verify on
>

The hardware I have do not support this, but this makes sense.


Cheers,

Murali Karicheri May 19, 2020, 4:11 p.m. UTC | #17

Hi Vinicius,

On 5/19/20 11:32 AM, Vinicius Costa Gomes wrote:
> Murali Karicheri <m-karicheri2@ti.com> writes:
> 
>>> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
>>> Frame preemption settings for enp3s0:
>>> 	support: supported
>>> 	active: active
>>> 	supported queues: 0xf
>>
>> I assume this is will be in sync with ethtool -L output which indicates
>> how many tx h/w queues present? I mean if there are 8 h/w queues,
>> supported queues will show 0xff.
> 
> In this approach, the driver builds these bitmasks, so it's responsible
> to keep it consistent with the rest of the stuff that's exposed in
> ethtool.
Agree
> 
>>
>>> 	supported queues: 0xe
>>   From the command below, it appears this is the preemptible queue mask.
>> bit 0  is Q0, bit 1 Q1 and so forth. Right? In that case isn't it more
>> clear to display
>>           preemptible queues : 0xef
>>
>> In the above Q7 is express queue and Q6-Q0 are preemptible.
> 
> In my case, the controller I have here only has 4 queues, and Queue 0 is
> the highest priority one, and it's marked as express.
> 

Ok. So it is up to the device driver to manage this.

>>
>> Also there is a handshake called verify that happens which initiated
>> by the h/w to check the capability of peer. It looks like
>> not all vendor's hardware supports it and good to have it displayed
>> something like
>>
>>           Verify supported/{not supported}
>>
>> If Verify is supported, FPE is enabled only if it succeeds. So may be
>> good to show a status of Verify if it is supported something like
>>           Verify success/Failed
>>
>>> 	minimum fragment size: 68
>>>
>>>
>>> $ ethtool --set-frame-preemption enp3s0 fp on min-frag-size 68 preemptible-queues-mask 0xe
>>>
>>> This is a RFC because I wanted to have feedback on some points:
>>>
>>>     - The parameters added are enough for the hardware I have, is it
>>>       enough in general?
>>
>> As described above, it would be good to add an optional parameter for
>> verify
>>
>> ethtool --set-frame-preemption enp3s0 fp on min-frag-size 68
>> preemptible-queues-mask 0xe verify on
>>
> 
> The hardware I have do not support this, but this makes sense.

I can work to support this for TI AM65 CPSW once a formal patch is
available. AM65 CPSW supports Verify as an optional feature. H/w works
also in manual mode where it is assumed that it is connected to a
IET FPE capable partner, but can't do Verify.

Thanks for sending the RFC.

regards,

Murali
> 
> 
> Cheers,
>

Murali Karicheri May 19, 2020, 4:34 p.m. UTC | #18

On 5/18/20 6:06 PM, Vinicius Costa Gomes wrote:
> Hi,
> 
> Jakub Kicinski <kuba@kernel.org> writes:
>>
>> Please take a look at the example from the cover letter:
>>
>> $ ethtool $ sudo ./ethtool --show-frame-preemption
>> enp3s0 Frame preemption settings for enp3s0:
>> 	support: supported
>> 	active: active
>> 	supported queues: 0xf
>> 	supported queues: 0xe
>> 	minimum fragment size: 68
>>
>> Reading this I have no idea what 0xe is. I have to go and query TC API
>> to see what priorities and queues that will be. Which IMHO is a strong
>> argument that this information belongs there in the first place.
> 
> That was the (only?) strong argument in favor of having frame preemption
> in the TC side when this was last discussed.
> 
> We can have a hybrid solution, we can move the express/preemptible per
> queue map to mqprio/taprio/whatever. And have the more specific
> configuration knobs, minimum fragment size, etc, in ethtool.

Isn't this a pure h/w feature? FPE is implemented at L2 and involves
fragments that are only seen by h/w and never at Linux network core
unlike IP fragments and is transparent to network stack. However it
enhances priority handling at h/w to the next level by pre-empting 
existing lower priority traffic to give way to express queue traffic
and improve latency. So everything happens in h/w. So ethtool makes
perfect sense here as it is a queue configuration. I agree with Vinicius
and Vladmir to support this in ethtool instead of TC.

Murali
> 
> What do you think?
> 
> 
> Cheers,
>

Vinicius Costa Gomes May 19, 2020, 5:49 p.m. UTC | #19

Murali Karicheri <m-karicheri2@ti.com> writes:

>> That was the (only?) strong argument in favor of having frame preemption
>> in the TC side when this was last discussed.
>> 
>> We can have a hybrid solution, we can move the express/preemptible per
>> queue map to mqprio/taprio/whatever. And have the more specific
>> configuration knobs, minimum fragment size, etc, in ethtool.
>
> Isn't this a pure h/w feature? FPE is implemented at L2 and involves
> fragments that are only seen by h/w and never at Linux network core
> unlike IP fragments and is transparent to network stack. However it
> enhances priority handling at h/w to the next level by pre-empting 
> existing lower priority traffic to give way to express queue traffic
> and improve latency. So everything happens in h/w. So ethtool makes
> perfect sense here as it is a queue configuration. I agree with Vinicius
> and Vladmir to support this in ethtool instead of TC.

The way I see, the issue that Jakub is pointing here is more of
usability/understandability.

By having the express/preemptible queue mapping in TC, we have the
configuration near where the "priority to queue" mapping happens. That
improves the ease of configuration, makes it easier to spot mistakes,
that kind of thing, all of which are a big plus.

Right now, I am seeing this hybrid approach as a good compromise, we
have the queue settings near to where the kinds of traffic are mapped to
queues, and we have the rest of the hardware configuration in ethtool.

Cheers,

Andre Guedes May 19, 2020, 10:39 p.m. UTC | #20

Hi,

Quoting Vinicius Costa Gomes (2020-05-15 18:29:44)
> One example, for retrieving and setting the configuration:
> 
> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
> Frame preemption settings for enp3s0:
>         support: supported
>         active: active

IIUC the code in patch 2, 'active' is the actual configuration knob that
enables or disables the FP functionality on the NIC.

That sounded a bit confusing to me since the spec uses the term 'active' to
indicate FP is currently enabled at both ends, and it is a read-only
information (see 12.30.1.4 from IEEE 802.1Q-2018). Maybe if we called this
'enabled' it would be more clear.

>         supported queues: 0xf
>         supported queues: 0xe
>         minimum fragment size: 68

I'm assuming this is the configuration knob for the minimal non-final fragment
defined in 802.3br.

My understanding from the specs is that this value must be a multiple from 64
and cannot assume arbitrary values like shown here. See 99.4.7.3 from IEEE
802.3 and Note 1 in S.2 from IEEE 802.1Q. In the previous discussion about FP,
we had this as a multiplier factor, not absolute value.

Regards,

Andre

Andre Guedes May 19, 2020, 10:40 p.m. UTC | #21

Hi,

Quoting Vinicius Costa Gomes (2020-05-18 12:34:22)
> Hi,
> 
> Michal Kubecek <mkubecek@suse.cz> writes:
> 
> > On Fri, May 15, 2020 at 06:29:44PM -0700, Vinicius Costa Gomes wrote:
> >> Hi,
> >> 
> >> This series adds support for configuring frame preemption, as defined
> >> by IEEE 802.1Q-2018 (previously IEEE 802.1Qbu) and IEEE 802.3br.
> >> 
> >> Frame preemption allows a packet from a higher priority queue marked
> >> as "express" to preempt a packet from lower priority queue marked as
> >> "preemptible". The idea is that this can help reduce the latency for
> >> higher priority traffic.
> >> 
> >> Previously, the proposed interface for configuring these features was
> >> using the qdisc layer. But as this is very hardware dependent and all
> >> that qdisc did was pass the information to the driver, it makes sense
> >> to have this in ethtool.
> >> 
> >> One example, for retrieving and setting the configuration:
> >> 
> >> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
> >> Frame preemption settings for enp3s0:
> >>      support: supported
> >
> > IMHO we don't need a special bool for this. IIUC this is not a state
> > flag that would change value for a particular device; either the device
> > supports the feature or it does not. If it does not, the ethtool_ops
> > callbacks would return -EOPNOTSUPP (or would not even exist if the
> > driver has no support) and ethtool would say so.
> 
> (I know that the comments below only apply if "ethtool-way" is what's
> decided)
> 
> Cool. Will remove the supported bit.
> 
> >
> >>      active: active
> >>      supported queues: 0xf

Following the same rationale, is this 'supported queue' going aways as well?

- Andre

Vinicius Costa Gomes May 19, 2020, 10:53 p.m. UTC | #22

Andre Guedes <andre.guedes@intel.com> writes:

>> >
>> >>      active: active
>> >>      supported queues: 0xf
>
> Following the same rationale, is this 'supported queue' going aways as well?
>

I think so, with good error messages, when trying to set an express-only
queue as preemptible, no need to expose this information.

Vinicius Costa Gomes May 19, 2020, 11:37 p.m. UTC | #23

Andre Guedes <andre.guedes@intel.com> writes:

> Hi,
>
> Quoting Vinicius Costa Gomes (2020-05-15 18:29:44)
>> One example, for retrieving and setting the configuration:
>> 
>> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
>> Frame preemption settings for enp3s0:
>>         support: supported
>>         active: active
>
> IIUC the code in patch 2, 'active' is the actual configuration knob that
> enables or disables the FP functionality on the NIC.
>
> That sounded a bit confusing to me since the spec uses the term 'active' to
> indicate FP is currently enabled at both ends, and it is a read-only
> information (see 12.30.1.4 from IEEE 802.1Q-2018). Maybe if we called this
> 'enabled' it would be more clear.

Good point. Will rename this to "enabled".

>
>>         supported queues: 0xf
>>         supported queues: 0xe
>>         minimum fragment size: 68
>
> I'm assuming this is the configuration knob for the minimal non-final fragment
> defined in 802.3br.
>
> My understanding from the specs is that this value must be a multiple from 64
> and cannot assume arbitrary values like shown here. See 99.4.7.3 from IEEE
> 802.3 and Note 1 in S.2 from IEEE 802.1Q. In the previous discussion about FP,
> we had this as a multiplier factor, not absolute value.

I thought that exposing this as "(1 + N)*64" (with 0 <= N <= 3) that it
was more related to what's exposed via LLDP than the actual capabilities
of the hardware. And for the hardware I have actually the values
supported are: (1 + N)*64 + 4 (for N = 0, 1, 2, 3).

So I thought I was better to let the driver decide what values are
acceptable.

This is a good question for people working with other hardware.

Murali Karicheri May 20, 2020, 12:47 p.m. UTC | #24

Hi Vinicius,

On 5/19/20 7:37 PM, Vinicius Costa Gomes wrote:
> Andre Guedes <andre.guedes@intel.com> writes:
> 
>> Hi,
>>
>> Quoting Vinicius Costa Gomes (2020-05-15 18:29:44)
>>> One example, for retrieving and setting the configuration:
>>>
>>> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
>>> Frame preemption settings for enp3s0:
>>>          support: supported
>>>          active: active
>>
>> IIUC the code in patch 2, 'active' is the actual configuration knob that
>> enables or disables the FP functionality on the NIC.
>>
>> That sounded a bit confusing to me since the spec uses the term 'active' to
>> indicate FP is currently enabled at both ends, and it is a read-only
>> information (see 12.30.1.4 from IEEE 802.1Q-2018). Maybe if we called this
>> 'enabled' it would be more clear.
> 
> Good point. Will rename this to "enabled".
> 
>>
>>>          supported queues: 0xf
>>>          supported queues: 0xe
>>>          minimum fragment size: 68
>>
>> I'm assuming this is the configuration knob for the minimal non-final fragment
>> defined in 802.3br.
>>
>> My understanding from the specs is that this value must be a multiple from 64
>> and cannot assume arbitrary values like shown here. See 99.4.7.3 from IEEE
>> 802.3 and Note 1 in S.2 from IEEE 802.1Q. In the previous discussion about FP,
>> we had this as a multiplier factor, not absolute value.
> 
> I thought that exposing this as "(1 + N)*64" (with 0 <= N <= 3) that it
> was more related to what's exposed via LLDP than the actual capabilities
> of the hardware. And for the hardware I have actually the values
> supported are: (1 + N)*64 + 4 (for N = 0, 1, 2, 3).
> 
> So I thought I was better to let the driver decide what values are
> acceptable.
> 
> This is a good question for people working with other hardware.
> 
> 
AM65 CPSW supports this as a multiple of 64. Since this ethtool
configuration is for the hardware, might want to make it as a multiple
of 64.

Murali

Joergen Andreasen May 20, 2020, 12:52 p.m. UTC | #25

The 05/19/2020 16:37, Vinicius Costa Gomes wrote:
> 
> Andre Guedes <andre.guedes@intel.com> writes:
> 
> > Hi,
> >
> > Quoting Vinicius Costa Gomes (2020-05-15 18:29:44)
> >> One example, for retrieving and setting the configuration:
> >>
> >> $ ethtool $ sudo ./ethtool --show-frame-preemption enp3s0
> >> Frame preemption settings for enp3s0:
> >>         support: supported
> >>         active: active
> >
> > IIUC the code in patch 2, 'active' is the actual configuration knob that
> > enables or disables the FP functionality on the NIC.
> >
> > That sounded a bit confusing to me since the spec uses the term 'active' to
> > indicate FP is currently enabled at both ends, and it is a read-only
> > information (see 12.30.1.4 from IEEE 802.1Q-2018). Maybe if we called this
> > 'enabled' it would be more clear.
> 
> Good point. Will rename this to "enabled".
> 
> >
> >>         supported queues: 0xf
> >>         supported queues: 0xe
> >>         minimum fragment size: 68
> >
> > I'm assuming this is the configuration knob for the minimal non-final fragment
> > defined in 802.3br.
> >
> > My understanding from the specs is that this value must be a multiple from 64
> > and cannot assume arbitrary values like shown here. See 99.4.7.3 from IEEE
> > 802.3 and Note 1 in S.2 from IEEE 802.1Q. In the previous discussion about FP,
> > we had this as a multiplier factor, not absolute value.
> 
> I thought that exposing this as "(1 + N)*64" (with 0 <= N <= 3) that it
> was more related to what's exposed via LLDP than the actual capabilities
> of the hardware. And for the hardware I have actually the values
> supported are: (1 + N)*64 + 4 (for N = 0, 1, 2, 3).
> 
> So I thought I was better to let the driver decide what values are
> acceptable.
> 
> This is a good question for people working with other hardware.
> 

I think it's most intuitive to use the values for AddFragSize as described in
802.3br (N = 0, 1, 2, 3).
You will anyway have to use one of these values when you want to expose the
requirements of your receiver through LLDP.

> 
> --
> Vinicius

Vinicius Costa Gomes May 20, 2020, 9:32 p.m. UTC | #26

Joergen Andreasen <joergen.andreasen@microchip.com> writes:

>> So I thought I was better to let the driver decide what values are
>> acceptable.
>> 
>> This is a good question for people working with other hardware.
>> 
>
> I think it's most intuitive to use the values for AddFragSize as described in
> 802.3br (N = 0, 1, 2, 3).
> You will anyway have to use one of these values when you want to expose the
> requirements of your receiver through LLDP.
>

Thanks. Seems that keeping this value restricted to multiples of 64 is
the way to go. Will fix for the next version of the series.

Cheers,

Andre Guedes May 20, 2020, 9:42 p.m. UTC | #27

Hi,

Quoting Jakub Kicinski (2020-05-18 16:09:06)
> On Mon, 18 May 2020 16:05:08 -0700 Vinicius Costa Gomes wrote:
> > Jakub Kicinski <kuba@kernel.org> writes:
> > >> That was the (only?) strong argument in favor of having frame preemption
> > >> in the TC side when this was last discussed.
> > >> 
> > >> We can have a hybrid solution, we can move the express/preemptible per
> > >> queue map to mqprio/taprio/whatever. And have the more specific
> > >> configuration knobs, minimum fragment size, etc, in ethtool.
> > >> 
> > >> What do you think?  
> > >
> > > Does the standard specify minimum fragment size as a global MAC setting?  
> > 
> > Yes, it's a per-MAC setting, not per-queue. 
> 
> If standard defines it as per-MAC and we can reasonably expect vendors
> won't try to "add value" and make it per queue (unlikely here AFAIU),
> then for this part ethtool configuration seems okay to me.

Before we move forward with this hybrid approach, let's recap a few points that
we discussed in the previous thread and make sure it addresses them properly.

1) Frame Preemption (FP) can be enabled without EST, as described in IEEE
802.1Q. In this case, the user has to create a dummy EST schedule in taprio
just to be able to enable FP, which doesn't look natural.

2) Mpqrio already looks overloaded. Besides mapping traffic classes into
hardware queues, it also supports different modes and traffic shaping. Do we
want to add yet another setting to it?

Regards,

Andre

Vinicius Costa Gomes May 20, 2020, 10:35 p.m. UTC | #28

Andre Guedes <andre.guedes@intel.com> writes:

>> If standard defines it as per-MAC and we can reasonably expect vendors
>> won't try to "add value" and make it per queue (unlikely here AFAIU),
>> then for this part ethtool configuration seems okay to me.
>
> Before we move forward with this hybrid approach, let's recap a few points that
> we discussed in the previous thread and make sure it addresses them
> properly.

Thanks for bringing them up.

>
> 1) Frame Preemption (FP) can be enabled without EST, as described in IEEE
> 802.1Q. In this case, the user has to create a dummy EST schedule in taprio
> just to be able to enable FP, which doesn't look natural.

What I meant by "dummy" schedule, is to configure taprio without
specifying any "sched-entry". And since we have support for adding
schedules during runtime, this might be even useful in general.

>
> 2) Mpqrio already looks overloaded. Besides mapping traffic classes into
> hardware queues, it also supports different modes and traffic shaping. Do we
> want to add yet another setting to it?

I also don't see this as a problem. The parameters that mqprio has carry
a lot of information, but the number of them is not that big.

Cheers,

[next-queue,RFC,0/4] ethtool: Add support for frame preemption

Message

Comments