diff mbox

[iproute2,net-next] tc: flower: support matching flags

Message ID 1482930409-55059-1-git-send-email-paulb@mellanox.com
State Accepted, archived
Delegated to: stephen hemminger
Headers show

Commit Message

Paul Blakey Dec. 28, 2016, 1:06 p.m. UTC
Enhance flower to support matching on flags.

The 1st flag allows to match on whether the packet is
an IP fragment.

Example:

	# add a flower filter that will drop fragmented packets
	# (bit 0 of control flags)
	tc filter add dev ens4f0 protocol ip parent ffff: \
		flower \
		src_mac e4:1d:2d:fd:8b:01 \
		dst_mac e4:1d:2d:fd:8b:02 \
		indev ens4f0 \
		matching_flags 0x1/0x1 \
	action drop

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
---
 man/man8/tc-flower.8 | 11 +++++++++++
 tc/f_flower.c        | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 63 insertions(+), 1 deletion(-)

Comments

Stephen Hemminger Dec. 29, 2016, 6:43 p.m. UTC | #1
On Wed, 28 Dec 2016 15:06:49 +0200
Paul Blakey <paulb@mellanox.com> wrote:

> Enhance flower to support matching on flags.
> 
> The 1st flag allows to match on whether the packet is
> an IP fragment.
> 
> Example:
> 
> 	# add a flower filter that will drop fragmented packets
> 	# (bit 0 of control flags)
> 	tc filter add dev ens4f0 protocol ip parent ffff: \
> 		flower \
> 		src_mac e4:1d:2d:fd:8b:01 \
> 		dst_mac e4:1d:2d:fd:8b:02 \
> 		indev ens4f0 \
> 		matching_flags 0x1/0x1 \
> 	action drop
> 
> Signed-off-by: Paul Blakey <paulb@mellanox.com>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> Reviewed-by: Roi Dayan <roid@mellanox.com>

Applied. Had to manually fixup merge conflicts with other flower changes.
Jiri Benc Jan. 2, 2017, 6:55 p.m. UTC | #2
On Wed, 28 Dec 2016 15:06:49 +0200, Paul Blakey wrote:
> Enhance flower to support matching on flags.
> 
> The 1st flag allows to match on whether the packet is
> an IP fragment.
> 
> Example:
> 
> 	# add a flower filter that will drop fragmented packets
> 	# (bit 0 of control flags)
> 	tc filter add dev ens4f0 protocol ip parent ffff: \
> 		flower \
> 		src_mac e4:1d:2d:fd:8b:01 \
> 		dst_mac e4:1d:2d:fd:8b:02 \
> 		indev ens4f0 \
> 		matching_flags 0x1/0x1 \
> 	action drop

This is very poor API. First, how is the user supposed to know what
those magic values in "matching_flags" mean? At the very least, it
should be documented in the man page.

Second, why "matching_flags"? That name suggests that those modify the
way the matching is done (to illustrate my point, I'd expect things
like "if the packet is too short, match this rule anyway" to be a
"matching flag"). But this is not the case. What's wrong with plain
"flags"? Or, if you want to be more specific, perhaps packet_flags?

Third, all of this looks very wrong anyway. There should be separate
keywords for individual flags. In this case, there should be an
"ip_fragment" flag. The tc tool should be responsible for putting the
flags together and creating the appropriate mask. The example would
then be:

	tc filter add dev ens4f0 protocol ip parent ffff: \
		flower \
		src_mac e4:1d:2d:fd:8b:01 \
		dst_mac e4:1d:2d:fd:8b:02 \
		indev ens4f0 \
		ip_fragment yes\
	action drop

I don't care whether it's "ip_fragment yes/no", "ip_fragment 1/0",
"ip_fragment/noip_fragment" or similar. The important thing is it's a
boolean flag; if specified, it's set to 0/1 and unmasked, if not
specified, it's wildcarded.

Stephen, I understand that you already applied this patch but given how
horrible the proposed API is and that's even undocumented in this
patch, please reconsider this. If this is released, the API is set in
stone and, frankly, it's very user unfriendly this way.

Paul, could you please prepare a patch that would introduce a more sane
API? I'd strongly prefer what I described under "third" but should you
strongly disagree, at least implement "second" and document the
currently known flag values.

Thanks,

 Jiri
Paul Blakey Jan. 3, 2017, 11:54 a.m. UTC | #3
On 02/01/2017 20:55, Jiri Benc wrote:
> On Wed, 28 Dec 2016 15:06:49 +0200, Paul Blakey wrote:
>> Enhance flower to support matching on flags.
>>
>> The 1st flag allows to match on whether the packet is
>> an IP fragment.
>>
>> Example:
>>
>> 	# add a flower filter that will drop fragmented packets
>> 	# (bit 0 of control flags)
>> 	tc filter add dev ens4f0 protocol ip parent ffff: \
>> 		flower \
>> 		src_mac e4:1d:2d:fd:8b:01 \
>> 		dst_mac e4:1d:2d:fd:8b:02 \
>> 		indev ens4f0 \
>> 		matching_flags 0x1/0x1 \
>> 	action drop
> This is very poor API. First, how is the user supposed to know what
> those magic values in "matching_flags" mean? At the very least, it
> should be documented in the man page.
>
> Second, why "matching_flags"? That name suggests that those modify the
> way the matching is done (to illustrate my point, I'd expect things
> like "if the packet is too short, match this rule anyway" to be a
> "matching flag"). But this is not the case. What's wrong with plain
> "flags"? Or, if you want to be more specific, perhaps packet_flags?
>
> Third, all of this looks very wrong anyway. There should be separate
> keywords for individual flags. In this case, there should be an
> "ip_fragment" flag. The tc tool should be responsible for putting the
> flags together and creating the appropriate mask. The example would
> then be:
>
> 	tc filter add dev ens4f0 protocol ip parent ffff: \
> 		flower \
> 		src_mac e4:1d:2d:fd:8b:01 \
> 		dst_mac e4:1d:2d:fd:8b:02 \
> 		indev ens4f0 \
> 		ip_fragment yes\
> 	action drop
>
> I don't care whether it's "ip_fragment yes/no", "ip_fragment 1/0",
> "ip_fragment/noip_fragment" or similar. The important thing is it's a
> boolean flag; if specified, it's set to 0/1 and unmasked, if not
> specified, it's wildcarded.
>
> Stephen, I understand that you already applied this patch but given how
> horrible the proposed API is and that's even undocumented in this
> patch, please reconsider this. If this is released, the API is set in
> stone and, frankly, it's very user unfriendly this way.
>
> Paul, could you please prepare a patch that would introduce a more sane
> API? I'd strongly prefer what I described under "third" but should you
> strongly disagree, at least implement "second" and document the
> currently known flag values.
>
> Thanks,
>
>   Jiri

Matching name was from the idea that we are doing is matching.
And regarding documentation/flag names I didn't want tc tool to be need 
of a update each time a new flag is introduced,
But I guess I can add two options like with ip_proto where you can 
specify known flags by name but can also give a value.
What do you think about that?

flags <FLAGS> / <HEX'/'HEX>
FLAGS => frag/no_frag/tcp_syn/no_tcp_syn ['|'<FLAGS>]*
e.g: flags frag|no_tcp_syn or flags 0x01/0x15
and the mask will have a on bits corresponds only to those flags specified.
Jiri Benc Jan. 3, 2017, 12:05 p.m. UTC | #4
On Tue, 3 Jan 2017 13:54:34 +0200, Paul Blakey wrote:
> Matching name was from the idea that we are doing is matching.

But we don't have matching_src_mac etc., either, although we're
matching on those fields.

> And regarding documentation/flag names I didn't want tc tool to be need 
> of a update each time a new flag is introduced,

It will be needed anyway because the whole thing would be useless
without proper documentation. So each time a new flag is added, a new
patch to the tc tool will be needed, at least with an addition to its
man page.

Please, let's focus on the *user*. The tc tool is hard to grasp for
users as it is. It's crystal clear to you but you know the kernel
internals. I'm very sure that except for the few kernel developers, no
one would understand what the "flags" field does. And even among the
kernel developers, very few would remember what the magic numeric
values mean.

If we want wider adoption of flower, we should make it as easy to use
as possible. Even when it means a bit more work for us.

> But I guess I can add two options like with ip_proto where you can 
> specify known flags by name but can also give a value.
> What do you think about that?
> 
> flags <FLAGS> / <HEX'/'HEX>
> FLAGS => frag/no_frag/tcp_syn/no_tcp_syn ['|'<FLAGS>]*
> e.g: flags frag|no_tcp_syn or flags 0x01/0x15
> and the mask will have a on bits corresponds only to those flags specified.

This works for me, too.

Thanks!

 Jiri
Simon Horman Jan. 4, 2017, 10:33 a.m. UTC | #5
On Tue, Jan 03, 2017 at 01:54:34PM +0200, Paul Blakey wrote:

...

Hi Paul,

> Matching name was from the idea that we are doing is matching.
> And regarding documentation/flag names I didn't want tc tool to be need of a
> update each time a new flag is introduced,
> But I guess I can add two options like with ip_proto where you can specify
> known flags by name but can also give a value.
> What do you think about that?
> 
> flags <FLAGS> / <HEX'/'HEX>
> FLAGS => frag/no_frag/tcp_syn/no_tcp_syn ['|'<FLAGS>]*
> e.g: flags frag|no_tcp_syn or flags 0x01/0x15
> and the mask will have a on bits corresponds only to those flags specified.

I suppose a flag is a flag and bitwise masking allows arbitrary matching
on one or more flags. But I wonder if, as per your example above,
it makes sense to mix IP (frag) and TCP flags in the same field of the
classifier.
Paul Blakey Jan. 4, 2017, 11:51 a.m. UTC | #6
On 04/01/2017 12:33, Simon Horman wrote:
> On Tue, Jan 03, 2017 at 01:54:34PM +0200, Paul Blakey wrote:
>
> ...
>
> Hi Paul,
>
>> Matching name was from the idea that we are doing is matching.
>> And regarding documentation/flag names I didn't want tc tool to be need of a
>> update each time a new flag is introduced,
>> But I guess I can add two options like with ip_proto where you can specify
>> known flags by name but can also give a value.
>> What do you think about that?
>>
>> flags <FLAGS> / <HEX'/'HEX>
>> FLAGS => frag/no_frag/tcp_syn/no_tcp_syn ['|'<FLAGS>]*
>> e.g: flags frag|no_tcp_syn or flags 0x01/0x15
>> and the mask will have a on bits corresponds only to those flags specified.
> I suppose a flag is a flag and bitwise masking allows arbitrary matching
> on one or more flags. But I wonder if, as per your example above,
> it makes sense to mix IP (frag) and TCP flags in the same field of the
> classifier.
It mimics the kernel packing of flags, I have no problem either way 
(flags, or ip_flags/tcp_flags pairs), what do you think jiri?
Jiri Benc Jan. 4, 2017, 11:55 a.m. UTC | #7
On Wed, 4 Jan 2017 13:51:13 +0200, Paul Blakey wrote:
> It mimics the kernel packing of flags, I have no problem either way 
> (flags, or ip_flags/tcp_flags pairs), what do you think jiri?

What Simon says makes sense to me. ip_flags and tcp_flags sounds like
the best solution so far (even better than my original suggestion).

Thanks,

 Jiri
Jiri Benc Jan. 18, 2017, 12:41 p.m. UTC | #8
On Wed, 4 Jan 2017 12:55:59 +0100, Jiri Benc wrote:
> On Wed, 4 Jan 2017 13:51:13 +0200, Paul Blakey wrote:
> > It mimics the kernel packing of flags, I have no problem either way 
> > (flags, or ip_flags/tcp_flags pairs), what do you think jiri?
> 
> What Simon says makes sense to me. ip_flags and tcp_flags sounds like
> the best solution so far (even better than my original suggestion).

Is there any progress with the follow up patch? I don't think we want
iproute2 with the magic numbers to be released.

Thanks,

 Jiri
Paul Blakey Jan. 19, 2017, 11:39 a.m. UTC | #9
On 18/01/2017 14:41, Jiri Benc wrote:
> On Wed, 4 Jan 2017 12:55:59 +0100, Jiri Benc wrote:
>> On Wed, 4 Jan 2017 13:51:13 +0200, Paul Blakey wrote:
>>> It mimics the kernel packing of flags, I have no problem either way
>>> (flags, or ip_flags/tcp_flags pairs), what do you think jiri?
>>
>> What Simon says makes sense to me. ip_flags and tcp_flags sounds like
>> the best solution so far (even better than my original suggestion).
>
> Is there any progress with the follow up patch? I don't think we want
> iproute2 with the magic numbers to be released.
>
> Thanks,
>
>  Jiri
>



Hi,
I've posted a patch:
"[PATCH iproute2 net-next] tc: flower: Refactor matching flags to be 
more user friendly"

Thanks,
Paul.
diff mbox

Patch

diff --git a/man/man8/tc-flower.8 b/man/man8/tc-flower.8
index 90fdfba..40117a9 100644
--- a/man/man8/tc-flower.8
+++ b/man/man8/tc-flower.8
@@ -39,6 +39,8 @@  flower \- flow based traffic control filter
 .IR KEY-ID " | {"
 .BR enc_dst_ip " | " enc_src_ip " } { "
 .IR ipv4_address " | " ipv6_address " } | "
+.B matching_flags
+.IR MATCHING-FLAGS " }"
 .SH DESCRIPTION
 The
 .B flower
@@ -134,6 +136,15 @@  Match on IP tunnel metadata. Key id
 is a 32 bit tunnel key id (e.g. VNI for VXLAN tunnel).
 .I ADDRESS
 must be a valid IPv4 or IPv6 address.
+.TP
+.BI matching_flags " MATCHING-FLAGS"
+Match on various dissector flags.
+.I MATCHING-FLAGS
+may be specified with or without a mask:
+.BR FLAGS
+or
+.BR FLAGS/FLAGS_MASK
+where each arg is an unsigned 32bit value in hexadecimal format.
 .SH NOTES
 As stated above where applicable, matches of a certain layer implicitly depend
 on the matches of the next lower layer. Precisely, layer one and two matches
diff --git a/tc/f_flower.c b/tc/f_flower.c
index 5dac427..479f5e6 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -56,7 +56,8 @@  static void explain(void)
 		"                       code ICMP-CODE }\n"
 		"                       enc_dst_ip [ IPV4-ADDR | IPV6-ADDR ] |\n"
 		"                       enc_src_ip [ IPV4-ADDR | IPV6-ADDR ] |\n"
-		"                       enc_key_id [ KEY-ID ] }\n"
+		"                       enc_key_id [ KEY-ID ] |\n"
+		"                       matching_flags MATCHING-FLAGS }\n"
 		"       FILTERID := X:Y:Z\n"
 		"       ACTION-SPEC := ... look at individual actions\n"
 		"\n"
@@ -99,6 +100,31 @@  static int flower_parse_vlan_eth_type(char *str, __be16 eth_type, int type,
 	return 0;
 }
 
+static int flower_parse_matching_flags(char *str, int type, int mask_type,
+				       struct nlmsghdr *n)
+{
+	__u32 mtf, mtf_mask;
+	char *c;
+
+	c = strchr(str, '/');
+	if (c)
+		*c = '\0';
+
+	if (get_u32(&mtf, str, 0))
+		return -1;
+
+	if (c) {
+		if (get_u32(&mtf_mask, ++c, 0))
+			return -1;
+	} else {
+		mtf_mask = 0xffffffff;
+	}
+
+	addattr32(n, MAX_MSG, type, htonl(mtf));
+	addattr32(n, MAX_MSG, mask_type, htonl(mtf_mask));
+	return 0;
+}
+
 static int flower_parse_ip_proto(char *str, __be16 eth_type, int type,
 				 __u8 *p_ip_proto, struct nlmsghdr *n)
 {
@@ -314,6 +340,16 @@  static int flower_parse_opt(struct filter_util *qu, char *handle,
 				return -1;
 			}
 			addattr_l(n, MAX_MSG, TCA_FLOWER_CLASSID, &handle, 4);
+		} else if (matches(*argv, "matching_flags") == 0) {
+			NEXT_ARG();
+			ret = flower_parse_matching_flags(*argv,
+							  TCA_FLOWER_KEY_FLAGS,
+							  TCA_FLOWER_KEY_FLAGS_MASK,
+							  n);
+			if (ret < 0) {
+				fprintf(stderr, "Illegal \"matching_flags\"\n");
+				return -1;
+			}
 		} else if (matches(*argv, "skip_hw") == 0) {
 			flags |= TCA_CLS_FLAGS_SKIP_HW;
 		} else if (matches(*argv, "skip_sw") == 0) {
@@ -604,6 +640,17 @@  static void flower_print_ip_proto(FILE *f, __u8 *p_ip_proto,
 	*p_ip_proto = ip_proto;
 }
 
+static void flower_print_matching_flags(FILE *f, char *name,
+					struct rtattr *attr,
+					struct rtattr *mask_attr)
+{
+	if (!mask_attr || RTA_PAYLOAD(mask_attr) != 4)
+		return;
+
+	fprintf(f, "\n  %s 0x%08x/0x%08x", name, ntohl(rta_getattr_u32(attr)),
+		mask_attr ? ntohl(rta_getattr_u32(mask_attr)) : 0xffffffff);
+}
+
 static void flower_print_ip_addr(FILE *f, char *name, __be16 eth_type,
 				 struct rtattr *addr4_attr,
 				 struct rtattr *mask4_attr,
@@ -754,6 +801,10 @@  static int flower_print_opt(struct filter_util *qu, FILE *f,
 	flower_print_key_id(f, "enc_key_id",
 			    tb[TCA_FLOWER_KEY_ENC_KEY_ID]);
 
+	flower_print_matching_flags(f, "matching_flags",
+				    tb[TCA_FLOWER_KEY_FLAGS],
+				    tb[TCA_FLOWER_KEY_FLAGS_MASK]);
+
 	if (tb[TCA_FLOWER_FLAGS]) {
 		__u32 flags = rta_getattr_u32(tb[TCA_FLOWER_FLAGS]);