diff mbox series

[iproute2-next,v3,2/2] tc: implement support for terse dump

Message ID 20201016144205.21787-3-vladbu@nvidia.com
State Changes Requested
Delegated to: David Ahern
Headers show
Series Implement filter terse dump mode support | expand

Commit Message

Vlad Buslov Oct. 16, 2020, 2:42 p.m. UTC
From: Vlad Buslov <vladbu@mellanox.com>

Implement support for classifier/action terse dump using new TCA_DUMP_FLAGS
tlv with only available flag value TCA_DUMP_FLAGS_TERSE. Set the flag when
user requested it with following example CLI (-br for 'brief'):

$ tc -s -brief filter show dev ens1f0 ingress
filter protocol all pref 49151 flower chain 0
filter protocol all pref 49151 flower chain 0 handle 0x1
  not_in_hw
        action order 1:         Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

filter protocol all pref 49152 flower chain 0
filter protocol all pref 49152 flower chain 0 handle 0x1
  not_in_hw
        action order 1:         Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

In terse mode dump only outputs essential data needed to identify the
filter and action (handle, cookie, etc.) and stats, if requested by the
user. The intention is to significantly improve rule dump rate by omitting
all static data that do not change after rule is created.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
---

Notes:
    Changes V2 -> V3:
    
    - Add brief dump output example to commit message.
    
    Changes V1 -> V2:
    
    - Invoke terse dump with tc command '-brief' option instead of
    filter-specific 'terse' flag.
    
    - Extend tc man and usage string with new option.

 man/man8/tc.8  | 6 ++++++
 tc/tc.c        | 6 +++++-
 tc/tc_filter.c | 9 +++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)

Comments

Jamal Hadi Salim Oct. 16, 2020, 4:07 p.m. UTC | #1
On 2020-10-16 10:42 a.m., Vlad Buslov wrote:
> From: Vlad Buslov <vladbu@mellanox.com>
> 
> Implement support for classifier/action terse dump using new TCA_DUMP_FLAGS
> tlv with only available flag value TCA_DUMP_FLAGS_TERSE. Set the flag when
> user requested it with following example CLI (-br for 'brief'):
> 
> $ tc -s -brief filter show dev ens1f0 ingress
> filter protocol all pref 49151 flower chain 0
> filter protocol all pref 49151 flower chain 0 handle 0x1
>    not_in_hw
>          action order 1:         Action statistics:
>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>          backlog 0b 0p requeues 0
> 
> filter protocol all pref 49152 flower chain 0
> filter protocol all pref 49152 flower chain 0 handle 0x1
>    not_in_hw
>          action order 1:         Action statistics:
>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>          backlog 0b 0p requeues 0
> 

Should the action name at least show up?


cheers,
jamal
Vlad Buslov Oct. 16, 2020, 4:42 p.m. UTC | #2
On Fri 16 Oct 2020 at 19:07, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2020-10-16 10:42 a.m., Vlad Buslov wrote:
>> From: Vlad Buslov <vladbu@mellanox.com>
>>
>> Implement support for classifier/action terse dump using new TCA_DUMP_FLAGS
>> tlv with only available flag value TCA_DUMP_FLAGS_TERSE. Set the flag when
>> user requested it with following example CLI (-br for 'brief'):
>>
>> $ tc -s -brief filter show dev ens1f0 ingress
>> filter protocol all pref 49151 flower chain 0
>> filter protocol all pref 49151 flower chain 0 handle 0x1
>>    not_in_hw
>>          action order 1:         Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>> filter protocol all pref 49152 flower chain 0
>> filter protocol all pref 49152 flower chain 0 handle 0x1
>>    not_in_hw
>>          action order 1:         Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>
> Should the action name at least show up?
>
>
> cheers,
> jamal

All action print callbacks have arg==NULL check and return at the
beginning. To print action type we need either to have dedicated
'brief_dump' callback instead of reusing print_aop() or extend/refactor
print_aop() implementation for all actions to always print the type
before checking the arg. What do you suggest?
Jamal Hadi Salim Oct. 17, 2020, 11:20 a.m. UTC | #3
On 2020-10-16 12:42 p.m., Vlad Buslov wrote:

> 
> All action print callbacks have arg==NULL check and return at the
> beginning. To print action type we need either to have dedicated
> 'brief_dump' callback instead of reusing print_aop() or extend/refactor
> print_aop() implementation for all actions to always print the type
> before checking the arg. What do you suggest?
> 

Either one sounds appealing - the refactoring feels simpler
as opposed to a->terse_print().

BTW: the action index, unless i missed something, is not transported
from the kernel for terse option. It is an important parameter
when actions are shared by filters (since they will have the same
index).
Am i missing something?

cheers,
jamal
Vlad Buslov Oct. 18, 2020, 12:16 p.m. UTC | #4
On Sat 17 Oct 2020 at 14:20, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2020-10-16 12:42 p.m., Vlad Buslov wrote:
>
>>
>> All action print callbacks have arg==NULL check and return at the
>> beginning. To print action type we need either to have dedicated
>> 'brief_dump' callback instead of reusing print_aop() or extend/refactor
>> print_aop() implementation for all actions to always print the type
>> before checking the arg. What do you suggest?
>>
>
> Either one sounds appealing - the refactoring feels simpler
> as opposed to a->terse_print().

With such refactoring we action type will be printed before some basic
validation which can lead to outputting the type together with error
message. Consider tunnel key action output callback as example:

static int print_tunnel_key(struct action_util *au, FILE *f, struct rtattr *arg)
{
	struct rtattr *tb[TCA_TUNNEL_KEY_MAX + 1];
	struct tc_tunnel_key *parm;

	if (!arg)
		return 0;

	parse_rtattr_nested(tb, TCA_TUNNEL_KEY_MAX, arg);

	if (!tb[TCA_TUNNEL_KEY_PARMS]) {
		fprintf(stderr, "Missing tunnel_key parameters\n");
		return -1;
	}
	parm = RTA_DATA(tb[TCA_TUNNEL_KEY_PARMS]);

	print_string(PRINT_ANY, "kind", "%s ", "tunnel_key");

If print "kind" call is moved before checking the arg it will always be
printed, even when immediately followed by "Missing tunnel_key
parameters\n" string. Is this a concern?

>
> BTW: the action index, unless i missed something, is not transported
> from the kernel for terse option. It is an important parameter
> when actions are shared by filters (since they will have the same
> index).
> Am i missing something?

Yes, tc_action_ops->dump(), which outputs action index among other data,
is not called at all by terse dump.

>
> cheers,
> jamal
Jamal Hadi Salim Oct. 19, 2020, 1:48 p.m. UTC | #5
On 2020-10-18 8:16 a.m., Vlad Buslov wrote:
> On Sat 17 Oct 2020 at 14:20, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> On 2020-10-16 12:42 p.m., Vlad Buslov wrote:
>>

>> Either one sounds appealing - the refactoring feels simpler
>> as opposed to a->terse_print().
> 
> With such refactoring we action type will be printed before some basic
> validation which can lead to outputting the type together with error
> message. Consider tunnel key action output callback as example:
> 
> static int print_tunnel_key(struct action_util *au, FILE *f, struct rtattr *arg)
> {
> 	struct rtattr *tb[TCA_TUNNEL_KEY_MAX + 1];
> 	struct tc_tunnel_key *parm;
> 
> 	if (!arg)
> 		return 0;
> 
> 	parse_rtattr_nested(tb, TCA_TUNNEL_KEY_MAX, arg);
> 
> 	if (!tb[TCA_TUNNEL_KEY_PARMS]) {
> 		fprintf(stderr, "Missing tunnel_key parameters\n");
> 		return -1;
> 	}
> 	parm = RTA_DATA(tb[TCA_TUNNEL_KEY_PARMS]);
> 
> 	print_string(PRINT_ANY, "kind", "%s ", "tunnel_key");
> 
> If print "kind" call is moved before checking the arg it will always be
> printed, even when immediately followed by "Missing tunnel_key
> parameters\n" string. Is this a concern?
> 

That could be a good thing, no? you get to see the action name with the
error. Its really not a big deal if you decide to do a->terse_print()
instead.

>>
>> BTW: the action index, unless i missed something, is not transported
>> from the kernel for terse option. It is an important parameter
>> when actions are shared by filters (since they will have the same
>> index).
>> Am i missing something?
> 
> Yes, tc_action_ops->dump(), which outputs action index among other data,
> is not called at all by terse dump.

I am suggesting it is an important detail that is currently missing.
Alternatively since you have the cookies in there - it is feasible that
someone who creates the action could "encode" the index in the cookie.
But that makes it a "proprietary" choice of whoever is creating
the filter/action.

cheers,
jamal
Vlad Buslov Oct. 19, 2020, 3:18 p.m. UTC | #6
On Mon 19 Oct 2020 at 16:48, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2020-10-18 8:16 a.m., Vlad Buslov wrote:
>> On Sat 17 Oct 2020 at 14:20, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>>> On 2020-10-16 12:42 p.m., Vlad Buslov wrote:
>>>
>
>>> Either one sounds appealing - the refactoring feels simpler
>>> as opposed to a->terse_print().
>>
>> With such refactoring we action type will be printed before some basic
>> validation which can lead to outputting the type together with error
>> message. Consider tunnel key action output callback as example:
>>
>> static int print_tunnel_key(struct action_util *au, FILE *f, struct rtattr *arg)
>> {
>> 	struct rtattr *tb[TCA_TUNNEL_KEY_MAX + 1];
>> 	struct tc_tunnel_key *parm;
>>
>> 	if (!arg)
>> 		return 0;
>>
>> 	parse_rtattr_nested(tb, TCA_TUNNEL_KEY_MAX, arg);
>>
>> 	if (!tb[TCA_TUNNEL_KEY_PARMS]) {
>> 		fprintf(stderr, "Missing tunnel_key parameters\n");
>> 		return -1;
>> 	}
>> 	parm = RTA_DATA(tb[TCA_TUNNEL_KEY_PARMS]);
>>
>> 	print_string(PRINT_ANY, "kind", "%s ", "tunnel_key");
>>
>> If print "kind" call is moved before checking the arg it will always be
>> printed, even when immediately followed by "Missing tunnel_key
>> parameters\n" string. Is this a concern?
>>
>
> That could be a good thing, no? you get to see the action name with the
> error. Its really not a big deal if you decide to do a->terse_print()
> instead.

Maybe. Just saying that this change would also change user-visible
iproute2 behavior.

>
>>>
>>> BTW: the action index, unless i missed something, is not transported
>>> from the kernel for terse option. It is an important parameter
>>> when actions are shared by filters (since they will have the same
>>> index).
>>> Am i missing something?
>>
>> Yes, tc_action_ops->dump(), which outputs action index among other data,
>> is not called at all by terse dump.
>
> I am suggesting it is an important detail that is currently missing.
> Alternatively since you have the cookies in there - it is feasible that
> someone who creates the action could "encode" the index in the cookie.
> But that makes it a "proprietary" choice of whoever is creating
> the filter/action.

It is not a trivial change. To get this data we need to call
tc_action_ops->dump() which puts bunch of other unrelated info in
TCA_OPTIONS nested attr. This hurts both dump size and runtime
performance. Even if we add another argument to dump "terse dump, print
only index", index is still part of larger options structure which
includes at least following fields:

#define tc_gen \
	__u32                 index; \
	__u32                 capab; \
	int                   action; \
	int                   refcnt; \
	int                   bindcnt

This wouldn't be much of a terse dump anymore. What prevents user that
needs all action info from calling regular dump? It is not like terse
dump substitutes it or somehow makes it harder to use.

>
> cheers,
> jamal
Jamal Hadi Salim Oct. 20, 2020, 12:29 p.m. UTC | #7
On 2020-10-19 11:18 a.m., Vlad Buslov wrote:
> 
> On Mon 19 Oct 2020 at 16:48, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> On 2020-10-18 8:16 a.m., Vlad Buslov wrote:

[..]

>> That could be a good thing, no? you get to see the action name with the
>> error. Its really not a big deal if you decide to do a->terse_print()
>> instead.
> 
> Maybe. Just saying that this change would also change user-visible
> iproute2 behavior.
> 

You are right(for the non-terse output). tbh, not sure if it is a big
deal given it happens only for the error case (where scripts look
for exit codes typically); having said that:
a ->terse_print() would be ok

> It is not a trivial change. To get this data we need to call
> tc_action_ops->dump() which puts bunch of other unrelated info in
> TCA_OPTIONS nested attr. This hurts both dump size and runtime
> performance. Even if we add another argument to dump "terse dump, print
> only index", index is still part of larger options structure which
> includes at least following fields:
> 
> #define tc_gen \
> 	__u32                 index; \
> 	__u32                 capab; \
> 	int                   action; \
> 	int                   refcnt; \
> 	int                   bindcnt
>


index is the _only_ important field for analytics purposes in that list.
i.e if i know the index i can correlate stats with one or more
filters (whether shared or not).
My worry is you have a very specific use case for your hardware or
maybe it is ovs - where counters are uniquely tied to filters and
there is no sharing. And possibly maybe only one counter can be tied
to a filter (was not sure if you could handle more than one action
in the terse from looking at the code).
Our assumptions so far had no such constraints.
Maybe a new TERSE_OPTIONS TLV, and then add an extra flag
to indicate interest in the tlv? Peharps store the stats in it as well.

> This wouldn't be much of a terse dump anymore. What prevents user that
> needs all action info from calling regular dump? It is not like terse
> dump substitutes it or somehow makes it harder to use.

Both scaling and correctness are important. You have the cookie
in the terse dump, thats a lot of data.
In our case we totally bypass filters to reduce the amount of data
crossing to user space (tc action ls). Theres still a lot of data
crossing which we could trim with a terse dump. All we are interested
in are stats. Another alternative is perhaps to introduce the index for
the direct dump.

cheers,
jamal
Vlad Buslov Oct. 21, 2020, 8:19 a.m. UTC | #8
On Tue 20 Oct 2020 at 15:29, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2020-10-19 11:18 a.m., Vlad Buslov wrote:
>>
>> On Mon 19 Oct 2020 at 16:48, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>>> On 2020-10-18 8:16 a.m., Vlad Buslov wrote:
>
> [..]
>
>>> That could be a good thing, no? you get to see the action name with the
>>> error. Its really not a big deal if you decide to do a->terse_print()
>>> instead.
>>
>> Maybe. Just saying that this change would also change user-visible
>> iproute2 behavior.
>>
>
> You are right(for the non-terse output). tbh, not sure if it is a big
> deal given it happens only for the error case (where scripts look
> for exit codes typically); having said that:
> a ->terse_print() would be ok
>
>> It is not a trivial change. To get this data we need to call
>> tc_action_ops->dump() which puts bunch of other unrelated info in
>> TCA_OPTIONS nested attr. This hurts both dump size and runtime
>> performance. Even if we add another argument to dump "terse dump, print
>> only index", index is still part of larger options structure which
>> includes at least following fields:
>>
>> #define tc_gen \
>> 	__u32                 index; \
>> 	__u32                 capab; \
>> 	int                   action; \
>> 	int                   refcnt; \
>> 	int                   bindcnt
>>
>
>
> index is the _only_ important field for analytics purposes in that list.
> i.e if i know the index i can correlate stats with one or more
> filters (whether shared or not).
> My worry is you have a very specific use case for your hardware or
> maybe it is ovs - where counters are uniquely tied to filters and
> there is no sharing. And possibly maybe only one counter can be tied
> to a filter (was not sure if you could handle more than one action
> in the terse from looking at the code).

OVS uses cookie to uniquely identify the flow and it does support
multiple actions per flow.

> Our assumptions so far had no such constraints.
> Maybe a new TERSE_OPTIONS TLV, and then add an extra flag
> to indicate interest in the tlv? Peharps store the stats in it as well.

Maybe, but wouldn't that require making it a new dump mode? Current
terse dump is already in released kernel and this seems like a
backward-incompatible change.

>
>> This wouldn't be much of a terse dump anymore. What prevents user that
>> needs all action info from calling regular dump? It is not like terse
>> dump substitutes it or somehow makes it harder to use.
>
> Both scaling and correctness are important. You have the cookie
> in the terse dump, thats a lot of data.

Cookie only consumes space in resulting netlink packet if used set the
cookie during action init. Otherwise, the cookie attribute is omitted.

> In our case we totally bypass filters to reduce the amount of data
> crossing to user space (tc action ls). Theres still a lot of data
> crossing which we could trim with a terse dump. All we are interested
> in are stats. Another alternative is perhaps to introduce the index for
> the direct dump.

What is the direct dump?

>
> cheers,
> jamal
Jamal Hadi Salim Oct. 22, 2020, 2:05 p.m. UTC | #9
On 2020-10-21 4:19 a.m., Vlad Buslov wrote:
> 
> On Tue 20 Oct 2020 at 15:29, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> On 2020-10-19 11:18 a.m., Vlad Buslov wrote:
>> My worry is you have a very specific use case for your hardware or
>> maybe it is ovs - where counters are uniquely tied to filters and
>> there is no sharing. And possibly maybe only one counter can be tied
>> to a filter (was not sure if you could handle more than one action
>> in the terse from looking at the code).
> 
> OVS uses cookie to uniquely identify the flow and it does support
> multiple actions per flow.


ok, so they use it like a flowid/classid to identify the flow.
In our use case the cookie stores all kinds of other state that
the controller can avoid to lookup after a query.
index otoh is universal i.e two different users can intepret it
per action tying it specific stats.
IOW: I dont think it replaces the index.
Do they set cookies on all actions in a flow?


>> Our assumptions so far had no such constraints.
>> Maybe a new TERSE_OPTIONS TLV, and then add an extra flag
>> to indicate interest in the tlv? Peharps store the stats in it as well.
> 
> Maybe, but wouldn't that require making it a new dump mode? Current
> terse dump is already in released kernel and this seems like a
> backward-incompatible change.
> 

I meant you would set a new flag(in addition to TERSE) in a request to
the kernel to ask for the index to be made available on the response.
Response comes back in a TLV with just index in it for now.

>>
>>> This wouldn't be much of a terse dump anymore. What prevents user that
>>> needs all action info from calling regular dump? It is not like terse
>>> dump substitutes it or somehow makes it harder to use.
>>
>> Both scaling and correctness are important. You have the cookie
>> in the terse dump, thats a lot of data.
> 
> Cookie only consumes space in resulting netlink packet if used set the
> cookie during action init. Otherwise, the cookie attribute is omitted.

True, but: I am wondering why it is even considered in when terseness
was a requirement (and index was left out).

>> In our case we totally bypass filters to reduce the amount of data
>> crossing to user space (tc action ls). Theres still a lot of data
>> crossing which we could trim with a terse dump. All we are interested
>> in are stats. Another alternative is perhaps to introduce the index for
>> the direct dump.
> 
> What is the direct dump?

tc action ls ...
Like i said in our use cases to get the stats we just dumped the actions
we wanted. It is a lot less data than having the filter + actions.
And with your idea of terseness we can trim down further how much
data by removing all the action attributes coming back if we set TERSE
flag in such a request. But the index has to be there to make sense.

cheers,
jamal
diff mbox series

Patch

diff --git a/man/man8/tc.8 b/man/man8/tc.8
index 7e9019f561ea..e8622053df65 100644
--- a/man/man8/tc.8
+++ b/man/man8/tc.8
@@ -854,6 +854,12 @@  option for creating
 alias.
 .RE
 
+.TP
+.BR "\-br" , " \-brief"
+Print only essential data needed to identify the filter and action (handle,
+cookie, etc.) and stats. This option is currently only supported by
+.BR "tc filter show " command.
+
 .SH "EXAMPLES"
 .PP
 tc -g class show dev eth0
diff --git a/tc/tc.c b/tc/tc.c
index 5d57054b45fb..bdd5d4faf886 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -44,6 +44,7 @@  bool use_names;
 int json;
 int color;
 int oneline;
+int brief;
 
 static char *conf_file;
 
@@ -202,7 +203,8 @@  static void usage(void)
 		"       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[aw] |\n"
 		"		    -o[neline] | -j[son] | -p[retty] | -c[olor]\n"
 		"		    -b[atch] [filename] | -n[etns] name | -N[umeric] |\n"
-		"		     -nm | -nam[es] | { -cf | -conf } path }\n");
+		"		     -nm | -nam[es] | { -cf | -conf } path\n"
+		"		     -br[ief] }\n");
 }
 
 static int do_cmd(int argc, char **argv)
@@ -336,6 +338,8 @@  int main(int argc, char **argv)
 			++json;
 		} else if (matches(argv[1], "-oneline") == 0) {
 			++oneline;
+		}else if (matches(argv[1], "-brief") == 0) {
+			++brief;
 		} else {
 			fprintf(stderr,
 				"Option \"%s\" is unknown, try \"tc -help\".\n",
diff --git a/tc/tc_filter.c b/tc/tc_filter.c
index c591a19f3123..71be2e8119c9 100644
--- a/tc/tc_filter.c
+++ b/tc/tc_filter.c
@@ -721,6 +721,15 @@  static int tc_filter_list(int cmd, int argc, char **argv)
 	if (filter_chain_index_set)
 		addattr32(&req.n, sizeof(req), TCA_CHAIN, chain_index);
 
+	if (brief) {
+		struct nla_bitfield32 flags = {
+			.value = TCA_DUMP_FLAGS_TERSE,
+			.selector = TCA_DUMP_FLAGS_TERSE
+		};
+
+		addattr_l(&req.n, MAX_MSG, TCA_DUMP_FLAGS, &flags, sizeof(flags));
+	}
+
 	if (rtnl_dump_request_n(&rth, &req.n) < 0) {
 		perror("Cannot send dump request");
 		return 1;