mbox series

[net-next,v2,0/6] RED: Introduce an ECN tail-dropping mode

Message ID 20200311173356.38181-1-petrm@mellanox.com
Headers show
Series RED: Introduce an ECN tail-dropping mode | expand

Message

Petr Machata March 11, 2020, 5:33 p.m. UTC
When the RED qdisc is currently configured to enable ECN, the RED algorithm
is used to decide whether a certain SKB should be marked. If that SKB is
not ECN-capable, it is early-dropped.

It is also possible to keep all traffic in the queue, and just mark the
ECN-capable subset of it, as appropriate under the RED algorithm. Some
switches support this mode, and some installations make use of it.
There is currently no way to put the RED qdiscs to this mode.

Therefore this patchset adds a new RED flag, TC_RED_TAILDROP. When the
qdisc is configured with this flag, non-ECT traffic is enqueued (and
tail-dropped when the queue size is exhausted) instead of being
early-dropped.

Unfortunately, adding a new RED flag is not as simple as it sounds. RED
flags are passed in tc_red_qopt.flags. However RED neglects to validate the
flag field, and just copies it over wholesale to its internal structure,
and later dumps it back.

A broken userspace can therefore configure a RED qdisc with arbitrary
unsupported flags, and later expect to see the flags on qdisc dump. The
current ABI thus allows storage of 5 bits of custom data along with the
qdisc instance.

GRED, SFQ and CHOKE qdiscs are in the same situation. (GRED validates VQ
flags, but not the flags for the main queue.) E.g. if SFQ ever needs to
support TC_RED_ADAPTATIVE, it needs another way of doing it, and at the
same time it needs to retain the possibility to store 6 bits of
uninterpreted data.

For RED, this problem is resolved in patch #2, which adds a new attribute,
and a way to separate flags from userbits that can be reused by other
qdiscs. The flag itself and related behavioral changes are added in patch
#3. In patch #4, the new mode is offloaded by mlxsw.

To test the new feature, patch #1 first introduces a TDC testsuite that
covers the existing RED flags. Patch #5 later extends it with taildrop
coverage. Patch #6 contains a forwarding selftest for the offloaded
datapath.

To test the SW datapath, I took the mlxsw selftest and adapted it in mostly
obvious ways. The test is stable enough to verify that RED, ECN and ECN
taildrop actually work. However, I have no confidence in its portability to
other people's machines or mildly different configurations. I therefore do
not find it suitable for upstreaming.

GRED and CHOKE can use the same method as RED if they ever need to support
extra flags. SFQ uses the length of TCA_OPTIONS to dispatch on binary
control structure version, and would therefore need a different approach.

v2:
- Patch #1
    - Require nsPlugin in each RED test
    - Match end-of-line to catch cases of more flags reported than
      requested
- Patch #2:
    - Replaced with another patch.
- Patch #3:
    - Fix red_use_taildrop() condition in red_enqueue switch for
      probabilistic case.
- Patch #5:
    - Require nsPlugin in each RED test
    - Match end-of-line to catch cases of more flags reported than
      requested
    - Add a test for creation of non-ECN taildrop, which should fail

Petr Machata (6):
  selftests: qdiscs: Add TDC test for RED
  net: sched: Allow extending set of supported RED flags
  net: sched: RED: Introduce an ECN tail-dropping mode
  mlxsw: spectrum_qdisc: Offload RED ECN tail-dropping mode
  selftests: qdiscs: RED: Add taildrop tests
  selftests: mlxsw: RED: Test RED ECN taildrop offload

 .../ethernet/mellanox/mlxsw/spectrum_qdisc.c  |   9 +-
 include/net/pkt_cls.h                         |   1 +
 include/net/red.h                             |  30 +++
 include/uapi/linux/pkt_sched.h                |  17 ++
 net/sched/sch_red.c                           |  53 ++++-
 .../drivers/net/mlxsw/sch_red_core.sh         |  50 ++++-
 .../drivers/net/mlxsw/sch_red_ets.sh          |  11 ++
 .../drivers/net/mlxsw/sch_red_root.sh         |   8 +
 .../tc-testing/tc-tests/qdiscs/red.json       | 185 ++++++++++++++++++
 9 files changed, 344 insertions(+), 20 deletions(-)
 create mode 100644 tools/testing/selftests/tc-testing/tc-tests/qdiscs/red.json

Comments

David Miller March 15, 2020, 4:04 a.m. UTC | #1
From: Petr Machata <petrm@mellanox.com>
Date: Wed, 11 Mar 2020 19:33:50 +0200

> When the RED qdisc is currently configured to enable ECN, the RED algorithm
> is used to decide whether a certain SKB should be marked. If that SKB is
> not ECN-capable, it is early-dropped.
> 
> It is also possible to keep all traffic in the queue, and just mark the
> ECN-capable subset of it, as appropriate under the RED algorithm. Some
> switches support this mode, and some installations make use of it.
> There is currently no way to put the RED qdiscs to this mode.
> 
> Therefore this patchset adds a new RED flag, TC_RED_TAILDROP. When the
> qdisc is configured with this flag, non-ECT traffic is enqueued (and
> tail-dropped when the queue size is exhausted) instead of being
> early-dropped.
 ...

Series applied, thank you.
Petr Machata March 16, 2020, 10:54 a.m. UTC | #2
David Miller <davem@davemloft.net> writes:

> From: Petr Machata <petrm@mellanox.com>
> Date: Wed, 11 Mar 2020 19:33:50 +0200
>
>> When the RED qdisc is currently configured to enable ECN, the RED algorithm
>> is used to decide whether a certain SKB should be marked. If that SKB is
>> not ECN-capable, it is early-dropped.
>>
>> It is also possible to keep all traffic in the queue, and just mark the
>> ECN-capable subset of it, as appropriate under the RED algorithm. Some
>> switches support this mode, and some installations make use of it.
>> There is currently no way to put the RED qdiscs to this mode.
>>
>> Therefore this patchset adds a new RED flag, TC_RED_TAILDROP. When the
>> qdisc is configured with this flag, non-ECT traffic is enqueued (and
>> tail-dropped when the queue size is exhausted) instead of being
>> early-dropped.
>  ...
>
> Series applied, thank you.

Dave, there were v3 and v4 for this patchset as well. They had a
different subject, s/taildrop/nodrop/, hence the confusion I think.
Should I send a delta patch with just the changes, or do you want to
revert-and-reapply, or...?
David Miller March 16, 2020, 9:55 p.m. UTC | #3
From: Petr Machata <petrm@mellanox.com>
Date: Mon, 16 Mar 2020 11:54:51 +0100

> Dave, there were v3 and v4 for this patchset as well. They had a
> different subject, s/taildrop/nodrop/, hence the confusion I think.
> Should I send a delta patch with just the changes, or do you want to
> revert-and-reapply, or...?

I prefer deltas, thank you.
Petr Machata March 17, 2020, 12:43 p.m. UTC | #4
David Miller <davem@davemloft.net> writes:

> From: Petr Machata <petrm@mellanox.com>
> Date: Mon, 16 Mar 2020 11:54:51 +0100
>
>> Dave, there were v3 and v4 for this patchset as well. They had a
>> different subject, s/taildrop/nodrop/, hence the confusion I think.
>> Should I send a delta patch with just the changes, or do you want to
>> revert-and-reapply, or...?
>
> I prefer deltas, thank you.

Never mind, it's just the merge commit that contains v2, the actual
patches are v4.