mbox series

[iproute2,net-next,v12,0/3] tc: shared block support

Message ID 20180120100029.886-1-jiri@resnulli.us
Headers show
Series tc: shared block support | expand

Message

Jiri Pirko Jan. 20, 2018, 10 a.m. UTC
From: Jiri Pirko <jiri@mellanox.com>

Kernel allows to share all filters between qdiscs with use
of shared block.

Example:

block number 22. "22" is just an identification:
$ tc qdisc add dev ens7 ingress_block 22 ingress
                        ^^^^^^^^^^^^^^^^
$ tc qdisc add dev ens8 ingress_block 22 ingress
                        ^^^^^^^^^^^^^^^^

If we don't specify "block" command line option, no shared block would
be created:
$ tc qdisc add dev ens9 ingress

Now if we list the qdiscs, we will see the block index in the output:

$ tc qdisc
qdisc ingress ffff: dev ens7 parent ffff:fff1 ingress_block 22
qdisc ingress ffff: dev ens8 parent ffff:fff1 ingress_block 22
qdisc ingress ffff: dev ens9 parent ffff:fff1


To make is more visual, the situation looks like this:

   ens7 ingress qdisc                 ens7 ingress qdisc
          |                                  |
          |                                  |
          +---------->  block 22  <----------+

Unlimited number of qdiscs may share the same block.

Block sharing is also supported for clsact qdisc:
$ tc qdisc add dev ens10 ingress_block 23 egress_block 24 clsact
$ tc qdisc show dev ens10
qdisc clsact ffff: dev ens10 parent ffff:fff1 ingress_block 23 egress_block 24


We can add filter using the block index:

$ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop


Note we cannot use the qdisc for filter manipulations of shared blocks:

$ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip 192.168.100.2 action drop
Error: This filter block is shared. Please use the block index to manipulate the filters.


We will see the same output if we list filters for ingress qdisc of
ens7 and ens8, also for the block 22:

$ tc filter show block 22
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
...

$ tc filter show dev ens7 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

$ tc filter show dev ens8 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

---
v11->v12:
- separate iproute2 patchset, kernel part is merged
- original patch 1 is removed as the header changes are already
  in iproute2 code
- patch 1:
 - fixed return type of tc_qdisc_block_exists
 - removed 0 checks for block in tc_qdisc_block_exists
 - rever xmas tree variables in tc_qdisc_block_exists
- patch 2
 - fixes error message when both dev and block are on the command line

---
Note that if you want me to change the ">=" vs "==" thing, I will do it.
But I think that it is a global iproute2 thing and should be changed in
a separate patch/patchset.

Jiri Pirko (3):
  tc: introduce tc_qdisc_block_exists helper
  tc: introduce support for block-handle for filter operations
  tc: implement ingress/egress block index attributes for qdiscs

 man/man8/tc.8  |  24 ++++++++++++-
 tc/tc_filter.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++----------
 tc/tc_qdisc.c  |  97 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tc/tc_util.h   |   2 ++
 4 files changed, 210 insertions(+), 18 deletions(-)

Comments

David Ahern Jan. 21, 2018, 7:25 p.m. UTC | #1
On 1/20/18 3:00 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Kernel allows to share all filters between qdiscs with use
> of shared block.
> 
> Example:
> 
> block number 22. "22" is just an identification:
> $ tc qdisc add dev ens7 ingress_block 22 ingress
>                         ^^^^^^^^^^^^^^^^
> $ tc qdisc add dev ens8 ingress_block 22 ingress
>                         ^^^^^^^^^^^^^^^^
> 
> If we don't specify "block" command line option, no shared block would
> be created:
> $ tc qdisc add dev ens9 ingress
> 
> Now if we list the qdiscs, we will see the block index in the output:
> 
> $ tc qdisc
> qdisc ingress ffff: dev ens7 parent ffff:fff1 ingress_block 22
> qdisc ingress ffff: dev ens8 parent ffff:fff1 ingress_block 22
> qdisc ingress ffff: dev ens9 parent ffff:fff1
> 
> 
> To make is more visual, the situation looks like this:
> 
>    ens7 ingress qdisc                 ens7 ingress qdisc
>           |                                  |
>           |                                  |
>           +---------->  block 22  <----------+
> 
> Unlimited number of qdiscs may share the same block.
> 
> Block sharing is also supported for clsact qdisc:
> $ tc qdisc add dev ens10 ingress_block 23 egress_block 24 clsact
> $ tc qdisc show dev ens10
> qdisc clsact ffff: dev ens10 parent ffff:fff1 ingress_block 23 egress_block 24
> 
> 
> We can add filter using the block index:
> 
> $ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop
> 
> 
> Note we cannot use the qdisc for filter manipulations of shared blocks:
> 
> $ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip 192.168.100.2 action drop
> Error: This filter block is shared. Please use the block index to manipulate the filters.
> 
> 
> We will see the same output if we list filters for ingress qdisc of
> ens7 and ens8, also for the block 22:
> 
> $ tc filter show block 22
> filter protocol ip pref 25 flower chain 0
> filter protocol ip pref 25 flower chain 0 handle 0x1
> ...
> 
> $ tc filter show dev ens7 ingress
> filter block 22 protocol ip pref 25 flower chain 0
> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
> ...
> 
> $ tc filter show dev ens8 ingress
> filter block 22 protocol ip pref 25 flower chain 0
> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
> ...
> 
> ---
> v11->v12:
> - separate iproute2 patchset, kernel part is merged
> - original patch 1 is removed as the header changes are already
>   in iproute2 code
> - patch 1:
>  - fixed return type of tc_qdisc_block_exists
>  - removed 0 checks for block in tc_qdisc_block_exists
>  - rever xmas tree variables in tc_qdisc_block_exists
> - patch 2
>  - fixes error message when both dev and block are on the command line
> 
> ---
> Note that if you want me to change the ">=" vs "==" thing, I will do it.
> But I think that it is a global iproute2 thing and should be changed in
> a separate patch/patchset.

IMO, it should be checking for ==. If the payload is not of an expected
length, then either the kernel and userspace differ on the API or the
parsing has gone off the rails.

FWIW, all of the the 'RTA_PAYLOAD >=' checks are under tc/.

Series applied to iproute2-next. Thanks, Jiri.