diff mbox series

ipset: Update byte and packet counters regardless of whether they match

Message ID f4b0ae68661c865c3083d2fa896e9a112057a82f.1582566351.git.sbrivio@redhat.com
State Changes Requested
Delegated to: Jozsef Kadlecsik
Headers show
Series ipset: Update byte and packet counters regardless of whether they match | expand

Commit Message

Stefano Brivio Feb. 24, 2020, 5:52 p.m. UTC
In ip_set_match_extensions(), for sets with counters, we take care of
updating counters themselves by calling ip_set_update_counter(), and of
checking if the given comparison and values match, by calling
ip_set_match_counter() if needed.

However, if a given comparison on counters doesn't match the configured
values, that doesn't mean the set entry itself isn't matching.

This fix restores the behaviour we had before commit 4750005a85f7
("netfilter: ipset: Fix "don't update counters" mode when counters used
at the matching"), without reintroducing the issue fixed there: back
then, mtype_data_match() first updated counters in any case, and then
took care of matching on counters.

Now, if the IPSET_FLAG_SKIP_COUNTER_UPDATE flag is set,
ip_set_update_counter() will anyway skip counter updates if desired.

The issue observed is illustrated by this reproducer:

  ipset create c hash:ip counters
  ipset add c 192.0.2.1
  iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP

if we now send packets from 192.0.2.1, bytes and packets counters
for the entry as shown by 'ipset list' are always zero, and, no
matter how many bytes we send, the rule will never match, because
counters themselves are not updated.

Reported-by: Mithil Mhatre <mmhatre@redhat.com>
Fixes: 4750005a85f7 ("netfilter: ipset: Fix "don't update counters" mode when counters used at the matching")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/netfilter/ipset/ip_set_core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jozsef Kadlecsik Feb. 25, 2020, 8:07 a.m. UTC | #1
Hi Stefano MithilMithil,

On Mon, 24 Feb 2020, Stefano Brivio wrote:

> In ip_set_match_extensions(), for sets with counters, we take care of 
> updating counters themselves by calling ip_set_update_counter(), and of 
> checking if the given comparison and values match, by calling 
> ip_set_match_counter() if needed.
> 
> However, if a given comparison on counters doesn't match the configured 
> values, that doesn't mean the set entry itself isn't matching.
> 
> This fix restores the behaviour we had before commit 4750005a85f7 
> ("netfilter: ipset: Fix "don't update counters" mode when counters used 
> at the matching"), without reintroducing the issue fixed there: back 
> then, mtype_data_match() first updated counters in any case, and then 
> took care of matching on counters.
> 
> Now, if the IPSET_FLAG_SKIP_COUNTER_UPDATE flag is set,
> ip_set_update_counter() will anyway skip counter updates if desired.
> 
> The issue observed is illustrated by this reproducer:
> 
>   ipset create c hash:ip counters
>   ipset add c 192.0.2.1
>   iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> 
> if we now send packets from 192.0.2.1, bytes and packets counters
> for the entry as shown by 'ipset list' are always zero, and, no
> matter how many bytes we send, the rule will never match, because
> counters themselves are not updated.

Sorry, but I disagree. ipset behaves the same as iptables itself: the 
counters are increased when the whole rule matches and that includes the 
counter comparison as well. I think it's less counter-intuitive that one 
can create never matching rules than to explain that "counter matching is 
a non-match for the point of view of 'when the rule matches, update the 
counter'".

What's really missing is a decrement-counters flag: that way one could 
store different "quotas" for the elements in a set.

Best regards,
Jozsef
 
> Reported-by: Mithil Mhatre <mmhatre@redhat.com>
> Fixes: 4750005a85f7 ("netfilter: ipset: Fix "don't update counters" mode when counters used at the matching")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  net/netfilter/ipset/ip_set_core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
> index 69c107f9ba8d..b140e38d9333 100644
> --- a/net/netfilter/ipset/ip_set_core.c
> +++ b/net/netfilter/ipset/ip_set_core.c
> @@ -649,13 +649,14 @@ ip_set_match_extensions(struct ip_set *set, const struct ip_set_ext *ext,
>  	if (SET_WITH_COUNTER(set)) {
>  		struct ip_set_counter *counter = ext_counter(data, set);
>  
> +		ip_set_update_counter(counter, ext, flags);
> +
>  		if (flags & IPSET_FLAG_MATCH_COUNTERS &&
>  		    !(ip_set_match_counter(ip_set_get_packets(counter),
>  				mext->packets, mext->packets_op) &&
>  		      ip_set_match_counter(ip_set_get_bytes(counter),
>  				mext->bytes, mext->bytes_op)))
>  			return false;
> -		ip_set_update_counter(counter, ext, flags);
>  	}
>  	if (SET_WITH_SKBINFO(set))
>  		ip_set_get_skbinfo(ext_skbinfo(data, set),
> -- 
> 2.25.0
> 
> 

-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary
Stefano Brivio Feb. 25, 2020, 8:40 a.m. UTC | #2
Hi Jozsef,

On Tue, 25 Feb 2020 09:07:09 +0100 (CET)
Jozsef Kadlecsik <kadlec@netfilter.org> wrote:

> Hi Stefano MithilMithil,
> 
> On Mon, 24 Feb 2020, Stefano Brivio wrote:
> 
> > In ip_set_match_extensions(), for sets with counters, we take care of 
> > updating counters themselves by calling ip_set_update_counter(), and of 
> > checking if the given comparison and values match, by calling 
> > ip_set_match_counter() if needed.
> > 
> > However, if a given comparison on counters doesn't match the configured 
> > values, that doesn't mean the set entry itself isn't matching.
> > 
> > This fix restores the behaviour we had before commit 4750005a85f7 
> > ("netfilter: ipset: Fix "don't update counters" mode when counters used 
> > at the matching"), without reintroducing the issue fixed there: back 
> > then, mtype_data_match() first updated counters in any case, and then 
> > took care of matching on counters.
> > 
> > Now, if the IPSET_FLAG_SKIP_COUNTER_UPDATE flag is set,
> > ip_set_update_counter() will anyway skip counter updates if desired.
> > 
> > The issue observed is illustrated by this reproducer:
> > 
> >   ipset create c hash:ip counters
> >   ipset add c 192.0.2.1
> >   iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > 
> > if we now send packets from 192.0.2.1, bytes and packets counters
> > for the entry as shown by 'ipset list' are always zero, and, no
> > matter how many bytes we send, the rule will never match, because
> > counters themselves are not updated.  
> 
> Sorry, but I disagree. ipset behaves the same as iptables itself: the 
> counters are increased when the whole rule matches and that includes the 
> counter comparison as well. I think it's less counter-intuitive that one 
> can create never matching rules than to explain that "counter matching is 
> a non-match for the point of view of 'when the rule matches, update the 
> counter'".

Note that this behaviour was modified two years ago: earlier, this was
not the case (and by the way this is how we found out, as it broke a
user setup).

Other than this, I'm a bit confused. How could --packets-gt and
--bytes-gt be used, if counters don't increase as long as the rule
doesn't match?

> What's really missing is a decrement-counters flag: that way one could 
> store different "quotas" for the elements in a set.

I see, that would work as well.
Jozsef Kadlecsik Feb. 25, 2020, 9:16 a.m. UTC | #3
Hi Stefano,

On Tue, 25 Feb 2020, Stefano Brivio wrote:

> > > The issue observed is illustrated by this reproducer:
> > > 
> > >   ipset create c hash:ip counters
> > >   ipset add c 192.0.2.1
> > >   iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > 
> > > if we now send packets from 192.0.2.1, bytes and packets counters
> > > for the entry as shown by 'ipset list' are always zero, and, no
> > > matter how many bytes we send, the rule will never match, because
> > > counters themselves are not updated.  
> > 
> > Sorry, but I disagree. ipset behaves the same as iptables itself: the 
> > counters are increased when the whole rule matches and that includes the 
> > counter comparison as well. I think it's less counter-intuitive that one 
> > can create never matching rules than to explain that "counter matching is 
> > a non-match for the point of view of 'when the rule matches, update the 
> > counter'".
> 
> Note that this behaviour was modified two years ago: earlier, this was 
> not the case (and by the way this is how we found out, as it broke a 
> user setup).

That's  really bad. Still, I think it was a bug earlier which was 
then fixed. The logic could be changed in the user rules from

iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP

to

iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
[ otherwise DROP ]

but of course it might be not so simple, depending on how the rules are 
built up.

> Other than this, I'm a bit confused. How could --packets-gt and
> --bytes-gt be used, if counters don't increase as long as the rule
> doesn't match?

I almost added to my previous mail that the 'ge' and 'gt' matches are not 
really useful at the moment...
 
> > What's really missing is a decrement-counters flag: that way one could 
> > store different "quotas" for the elements in a set.
> 
> I see, that would work as well.

The other possibility is to force counter update. I.e. instead of

iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP

something like

iptables -I INPUT -m set --match-set c src --update-counters \
	--bytes-gt 800 -j DROP

but that also requires some internal changes to store a new flag, because 
at the moment only "! --update-counters" is supported. So there'd be then 
a fine-grained control over how the counters are updated:

- no --update-counters flag: update counters only if the whole rule 
  matches, including the counter matches
- --update-counters flag: update counters if counter matching is false
- ! --update-counters flag: don't update counters

Best regards,
Jozsef 
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary
Stefano Brivio Feb. 25, 2020, 12:22 p.m. UTC | #4
On Tue, 25 Feb 2020 10:16:40 +0100 (CET)
Jozsef Kadlecsik <kadlec@netfilter.org> wrote:

> Hi Stefano,
> 
> On Tue, 25 Feb 2020, Stefano Brivio wrote:
> 
> > > > The issue observed is illustrated by this reproducer:
> > > > 
> > > >   ipset create c hash:ip counters
> > > >   ipset add c 192.0.2.1
> > > >   iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > 
> > > > if we now send packets from 192.0.2.1, bytes and packets counters
> > > > for the entry as shown by 'ipset list' are always zero, and, no
> > > > matter how many bytes we send, the rule will never match, because
> > > > counters themselves are not updated.    
> > > 
> > > Sorry, but I disagree. ipset behaves the same as iptables itself: the 
> > > counters are increased when the whole rule matches and that includes the 
> > > counter comparison as well. I think it's less counter-intuitive that one 
> > > can create never matching rules than to explain that "counter matching is 
> > > a non-match for the point of view of 'when the rule matches, update the 
> > > counter'".  
> > 
> > Note that this behaviour was modified two years ago: earlier, this was 
> > not the case (and by the way this is how we found out, as it broke a 
> > user setup).  
> 
> That's  really bad. Still, I think it was a bug earlier which was 
> then fixed. The logic could be changed in the user rules from
> 
> iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> 
> to
> 
> iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> [ otherwise DROP ]
> 
> but of course it might be not so simple, depending on how the rules are 
> built up.

Yes, it would work, unless the user actually wants to check with the
same counter how many bytes are sent "in excess".

Now, I see the conceptual problem about matching: if the rule isn't
matching, and counters count matched packets, counters shouldn't
increase. But still, I think there are a number of facts to
be considered:

- the man page says (and has said for a number of years):

	If the packet is matched an element in the set, match only if
	the byte counter of the element is greater than the given value
	as well.

  which actually makes the problem undecidable: matching depends on
  matching itself. Trying some "common sense" interpretation, I would
  read this as:

	If the packet matches an *element* in the set, this *rule* will
	match only if the byte counter of the element is greater than
	the given value.

  that is, by separating the meaning of "element matching" from "rule
  matching", this starts making sense.

- I spent the past two hours trying to think of an actual case that was
  affected by 4750005a85f7, *other than the "main" bug it fixes*, that
  is, "! --update-counters" was ignored altogether, and I couldn't.

  Even if we had a --bytes-lt option, it would be counter-intuitive,
  because the counter would be updated until bytes are less than the
  threshold, and then the rule would stop matching, meaning that the
  user most probably thinks:

	"Drop matching packets as long as less than 800 bytes are sent"

  and what happens is:

	"Count and drop matching packets until 800 bytes are sent, then
	stop dropping and counting them"

  The only "functional" case I can think of is something like
  --bytes-lt 800 -j ACCEPT. User probably thinks:

	"Don't let more than 800 bytes go through"

  and what happens is:

	"Let up to 800 bytes, or 799 bytes plus one packet, go through,
	counting the bytes in packets that were let through"

  which isn't much different from the expectation.

- and then,

> > Other than this, I'm a bit confused. How could --packets-gt and
> > --bytes-gt be used, if counters don't increase as long as the rule
> > doesn't match?  
> 
> I almost added to my previous mail that the 'ge' and 'gt' matches are not 
> really useful at the moment...

...yes, I can't think of any other use for those either.

> > > What's really missing is a decrement-counters flag: that way one could 
> > > store different "quotas" for the elements in a set.  
> > 
> > I see, that would work as well.  
> 
> The other possibility is to force counter update. I.e. instead of
> 
> iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> 
> something like
> 
> iptables -I INPUT -m set --match-set c src --update-counters \
> 	--bytes-gt 800 -j DROP
> 
> but that also requires some internal changes to store a new flag, because 
> at the moment only "! --update-counters" is supported. So there'd be then 
> a fine-grained control over how the counters are updated:
> 
> - no --update-counters flag: update counters only if the whole rule 
>   matches, including the counter matches
> - --update-counters flag: update counters if counter matching is false

...this should probably be "in any case", also if it's true.

> - ! --update-counters flag: don't update counters

I think that would fix the issue as well, I'm just struggling to find a
sensible use case for the "no --update-counters" case -- especially one
where there would be a substantial issue with the change I proposed.
Jozsef Kadlecsik Feb. 25, 2020, 8:37 p.m. UTC | #5
On Tue, 25 Feb 2020, Stefano Brivio wrote:

> > The logic could be changed in the user rules from
> > 
> > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > 
> > to
> > 
> > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > [ otherwise DROP ]
> > 
> > but of course it might be not so simple, depending on how the rules are 
> > built up.
> 
> Yes, it would work, unless the user actually wants to check with the
> same counter how many bytes are sent "in excess".

You mean the counters are still updated whenever the element is matched in 
the set and then one could check how many bytes were sent over the 
threshold just by listing the set elements.

> Now, I see the conceptual problem about matching: if the rule isn't 
> matching, and counters count matched packets, counters shouldn't 
> increase. But still, I think there are a number of facts to be 
> considered:
> 
> - the man page says (and has said for a number of years):
> 
> 	If the packet is matched an element in the set, match only if
> 	the byte counter of the element is greater than the given value
> 	as well.
> 
>   which actually makes the problem undecidable: matching depends on
>   matching itself. Trying some "common sense" interpretation, I would
>   read this as:
> 
> 	If the packet matches an *element* in the set, this *rule* will
> 	match only if the byte counter of the element is greater than
> 	the given value.
> 
>   that is, by separating the meaning of "element matching" from "rule
>   matching", this starts making sense.

Yes, you are right. Sometimes I think I'm far from the best at writing 
documentation... So I'm going to update the manpage with your sentence.
 
> - I spent the past two hours trying to think of an actual case that was
>   affected by 4750005a85f7, *other than the "main" bug it fixes*, that
>   is, "! --update-counters" was ignored altogether, and I couldn't.
> 
>   Even if we had a --bytes-lt option, it would be counter-intuitive,
>   because the counter would be updated until bytes are less than the
>   threshold, and then the rule would stop matching, meaning that the
>   user most probably thinks:
> 
> 	"Drop matching packets as long as less than 800 bytes are sent"
> 
>   and what happens is:
> 
> 	"Count and drop matching packets until 800 bytes are sent, then
> 	stop dropping and counting them"

Again, yes, that's what would happen.

>   The only "functional" case I can think of is something like
>   --bytes-lt 800 -j ACCEPT. User probably thinks:
> 
> 	"Don't let more than 800 bytes go through"
> 
>   and what happens is:
> 
> 	"Let up to 800 bytes, or 799 bytes plus one packet, go through,
> 	counting the bytes in packets that were let through"
> 
>   which isn't much different from the expectation.
> 
> - and then,
> 
> > > Other than this, I'm a bit confused. How could --packets-gt and
> > > --bytes-gt be used, if counters don't increase as long as the rule
> > > doesn't match?  
> > 
> > I almost added to my previous mail that the 'ge' and 'gt' matches are not 
> > really useful at the moment...
> 
> ...yes, I can't think of any other use for those either.

Those could really be useful if the counters could be decremented. 
Otherwise I think the counter matching in the sets is not as useful as it 
seems to be.
 
> > > > What's really missing is a decrement-counters flag: that way one could 
> > > > store different "quotas" for the elements in a set.  
> > > 
> > > I see, that would work as well.  
> > 
> > The other possibility is to force counter update. I.e. instead of
> > 
> > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > 
> > something like
> > 
> > iptables -I INPUT -m set --match-set c src --update-counters \
> > 	--bytes-gt 800 -j DROP
> > 
> > but that also requires some internal changes to store a new flag, because 
> > at the moment only "! --update-counters" is supported. So there'd be then 
> > a fine-grained control over how the counters are updated:
> > 
> > - no --update-counters flag: update counters only if the whole rule 
> >   matches, including the counter matches
> > - --update-counters flag: update counters if counter matching is false
> 
> ...this should probably be "in any case", also if it's true.

Yes, but now I don't really like the name itself: --force-update-counters
or something like that would be more clear.

> > - ! --update-counters flag: don't update counters
> 
> I think that would fix the issue as well, I'm just struggling to find a
> sensible use case for the "no --update-counters" case -- especially one
> where there would be a substantial issue with the change I proposed.

The no update counter flag was introduced to handle when one needs to 
match in the same set multiple times, i.e. there are multiple rules with 
the same set. Like you need to match in the raw/mangle/filter tables as 
well. Unfortunately I can't recall the usercase.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary
Stefano Brivio Feb. 25, 2020, 8:53 p.m. UTC | #6
On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
Jozsef Kadlecsik <kadlec@netfilter.org> wrote:

> On Tue, 25 Feb 2020, Stefano Brivio wrote:
> 
> > > The logic could be changed in the user rules from
> > > 
> > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > 
> > > to
> > > 
> > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > [ otherwise DROP ]
> > > 
> > > but of course it might be not so simple, depending on how the rules are 
> > > built up.  
> > 
> > Yes, it would work, unless the user actually wants to check with the
> > same counter how many bytes are sent "in excess".  
> 
> You mean the counters are still updated whenever the element is matched in 
> the set and then one could check how many bytes were sent over the 
> threshold just by listing the set elements.

Yes, exactly -- note that it was possible (and, I think, used) before.

> > Now, I see the conceptual problem about matching: if the rule isn't 
> > matching, and counters count matched packets, counters shouldn't 
> > increase. But still, I think there are a number of facts to be 
> > considered:
> > 
> > - the man page says (and has said for a number of years):
> > 
> > 	If the packet is matched an element in the set, match only if
> > 	the byte counter of the element is greater than the given value
> > 	as well.
> > 
> >   which actually makes the problem undecidable: matching depends on
> >   matching itself. Trying some "common sense" interpretation, I would
> >   read this as:
> > 
> > 	If the packet matches an *element* in the set, this *rule* will
> > 	match only if the byte counter of the element is greater than
> > 	the given value.
> > 
> >   that is, by separating the meaning of "element matching" from "rule
> >   matching", this starts making sense.  
> 
> Yes, you are right. Sometimes I think I'm far from the best at writing 
> documentation... So I'm going to update the manpage with your sentence.

Wait, though: that's only the case if we update the counters for
matching *elements* and not necessarily matching *rules*, which was the
case before 4750005a85f7, or with this patch.

Otherwise, the sentence I wrote is not accurate. I can try to come up
with another one to describe the current behaviour, but I'll need some
calm minutes with pencil and paper tomorrow.

> > - I spent the past two hours trying to think of an actual case that was
> >   affected by 4750005a85f7, *other than the "main" bug it fixes*, that
> >   is, "! --update-counters" was ignored altogether, and I couldn't.
> > 
> >   Even if we had a --bytes-lt option, it would be counter-intuitive,
> >   because the counter would be updated until bytes are less than the
> >   threshold, and then the rule would stop matching, meaning that the
> >   user most probably thinks:
> > 
> > 	"Drop matching packets as long as less than 800 bytes are sent"
> > 
> >   and what happens is:
> > 
> > 	"Count and drop matching packets until 800 bytes are sent, then
> > 	stop dropping and counting them"  
> 
> Again, yes, that's what would happen.
> 
> >   The only "functional" case I can think of is something like
> >   --bytes-lt 800 -j ACCEPT. User probably thinks:
> > 
> > 	"Don't let more than 800 bytes go through"
> > 
> >   and what happens is:
> > 
> > 	"Let up to 800 bytes, or 799 bytes plus one packet, go through,
> > 	counting the bytes in packets that were let through"
> > 
> >   which isn't much different from the expectation.
> > 
> > - and then,
> >   
> > > > Other than this, I'm a bit confused. How could --packets-gt and
> > > > --bytes-gt be used, if counters don't increase as long as the rule
> > > > doesn't match?    
> > > 
> > > I almost added to my previous mail that the 'ge' and 'gt' matches are not 
> > > really useful at the moment...  
> > 
> > ...yes, I can't think of any other use for those either.  
> 
> Those could really be useful if the counters could be decremented. 
> Otherwise I think the counter matching in the sets is not as useful as it 
> seems to be.

Still, if counters are updated with just matching element, but not
necessarily matching rule, they should be as useful as in the hypothesis
of introducing a "decrementing" feature -- one just needs to adjust the
rule logic to that.

> > > > > What's really missing is a decrement-counters flag: that way one could 
> > > > > store different "quotas" for the elements in a set.    
> > > > 
> > > > I see, that would work as well.    
> > > 
> > > The other possibility is to force counter update. I.e. instead of
> > > 
> > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > 
> > > something like
> > > 
> > > iptables -I INPUT -m set --match-set c src --update-counters \
> > > 	--bytes-gt 800 -j DROP
> > > 
> > > but that also requires some internal changes to store a new flag, because 
> > > at the moment only "! --update-counters" is supported. So there'd be then 
> > > a fine-grained control over how the counters are updated:
> > > 
> > > - no --update-counters flag: update counters only if the whole rule 
> > >   matches, including the counter matches
> > > - --update-counters flag: update counters if counter matching is false  
> > 
> > ...this should probably be "in any case", also if it's true.  
> 
> Yes, but now I don't really like the name itself: --force-update-counters
> or something like that would be more clear.
> 
> > > - ! --update-counters flag: don't update counters  
> > 
> > I think that would fix the issue as well, I'm just struggling to find a
> > sensible use case for the "no --update-counters" case -- especially one
> > where there would be a substantial issue with the change I proposed.  
> 
> The no update counter flag was introduced to handle when one needs to 
> match in the same set multiple times, i.e. there are multiple rules with 
> the same set. Like you need to match in the raw/mangle/filter tables as 
> well. Unfortunately I can't recall the usercase.

Okay, but what you're describing is the "! --update-counters" option.
That works, didn't work before 4750005a85f7, but would still work with
this change.

What I meant is really the case where "--update-counters" (or
"--force-update-counters") and "! --update-counters" are both absent: I
don't see any particular advantage in the current behaviour for that
case.
Jozsef Kadlecsik Feb. 27, 2020, 8:37 p.m. UTC | #7
Hi Stefano,

On Tue, 25 Feb 2020, Stefano Brivio wrote:

> On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
> Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> 
> > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > 
> > > > The logic could be changed in the user rules from
> > > > 
> > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > 
> > > > to
> > > > 
> > > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > > [ otherwise DROP ]
> > > > 
> > > > but of course it might be not so simple, depending on how the rules are 
> > > > built up.  
> > > 
> > > Yes, it would work, unless the user actually wants to check with the
> > > same counter how many bytes are sent "in excess".  
> > 
> > You mean the counters are still updated whenever the element is matched in 
> > the set and then one could check how many bytes were sent over the 
> > threshold just by listing the set elements.
> 
> Yes, exactly -- note that it was possible (and, I think, used) before.

I'm still not really convinced about such a feature. Why is it useful to 
know how many bytes would be sent over the "limit"? Also, there's no 
protection against overflow in the counters. I know firewalls with ipset, 
10gb interfaces and long uptimes, so it's not completely a theoretical 
issue.
 
> > > > I almost added to my previous mail that the 'ge' and 'gt' matches 
> > > > are not really useful at the moment...
> > > 
> > > ...yes, I can't think of any other use for those either.  
> > 
> > Those could really be useful if the counters could be decremented. 
> > Otherwise I think the counter matching in the sets is not as useful as 
> > it seems to be.
> 
> Still, if counters are updated with just matching element, but not 
> necessarily matching rule, they should be as useful as in the hypothesis 
> of introducing a "decrementing" feature -- one just needs to adjust the 
> rule logic to that.

That's true.

> > > > The other possibility is to force counter update. I.e. instead of
> > > > 
> > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > 
> > > > something like
> > > > 
> > > > iptables -I INPUT -m set --match-set c src --update-counters \
> > > > 	--bytes-gt 800 -j DROP
> > > > 
> > > > but that also requires some internal changes to store a new flag, because 
> > > > at the moment only "! --update-counters" is supported. So there'd be then 
> > > > a fine-grained control over how the counters are updated:
> > > > 
> > > > - no --update-counters flag: update counters only if the whole rule 
> > > >   matches, including the counter matches
> > > > - --update-counters flag: update counters if counter matching is false  
> > > 
> > > ...this should probably be "in any case", also if it's true.  
> > 
> > Yes, but now I don't really like the name itself: --force-update-counters
> > or something like that would be more clear.
> > 
> > > > - ! --update-counters flag: don't update counters  
> > > 
> > > I think that would fix the issue as well, I'm just struggling to find a
> > > sensible use case for the "no --update-counters" case -- especially one
> > > where there would be a substantial issue with the change I proposed.  
> > 
> > The no update counter flag was introduced to handle when one needs to 
> > match in the same set multiple times, i.e. there are multiple rules with 
> > the same set. Like you need to match in the raw/mangle/filter tables as 
> > well. Unfortunately I can't recall the usercase.
> 
> Okay, but what you're describing is the "! --update-counters" option. 
> That works, didn't work before 4750005a85f7, but would still work with 
> this change.
> 
> What I meant is really the case where "--update-counters" (or 
> "--force-update-counters") and "! --update-counters" are both absent: I 
> don't see any particular advantage in the current behaviour for that 
> case.

The counters are used just for statistical purposes: reflect the 
packets/bytes which were let through, i.e. matched the whole "rule".
In that case updating the counters before the counter value matching is 
evaluated gives false results.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary
Stefano Brivio Feb. 28, 2020, 11:40 a.m. UTC | #8
Hi Jozsef,

On Thu, 27 Feb 2020 21:37:10 +0100 (CET)
Jozsef Kadlecsik <kadlec@netfilter.org> wrote:

> Hi Stefano,
> 
> On Tue, 25 Feb 2020, Stefano Brivio wrote:
> 
> > On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
> > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> >   
> > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > >   
> > > > > The logic could be changed in the user rules from
> > > > > 
> > > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > > 
> > > > > to
> > > > > 
> > > > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > > > [ otherwise DROP ]
> > > > > 
> > > > > but of course it might be not so simple, depending on how the rules are 
> > > > > built up.    
> > > > 
> > > > Yes, it would work, unless the user actually wants to check with the
> > > > same counter how many bytes are sent "in excess".    
> > > 
> > > You mean the counters are still updated whenever the element is matched in 
> > > the set and then one could check how many bytes were sent over the 
> > > threshold just by listing the set elements.  
> > 
> > Yes, exactly -- note that it was possible (and, I think, used) before.  
> 
> I'm still not really convinced about such a feature. Why is it useful to 
> know how many bytes would be sent over the "limit"?

This is useful in case one wants different treatments for packets
according to a number of thresholds in different rules. For example,

    iptables -I INPUT -m set --match-set c src --bytes-lt 100 -j noise
    iptables -I noise -m set --match-set c src --bytes-lt 20000 -j download

and you want to log packets from chains 'noise' and 'download' with
different prefixes.

> Also, there's no protection against overflow in the counters. I know
> firewalls with ipset, 10gb interfaces and long uptimes, so it's not
> completely a theoretical issue.

With 10GbE, 64-bit counters can cover more than:
  2 ^ 64 / (10 * 1000 * 1000 * 1000 / 8) = 14757395259 seconds
that is,
  14757395259 / (60 * 60 * 24) = 170803 days
that is,
  170803 / 365 = 468 years

...is that a real issue?

> > > > > I almost added to my previous mail that the 'ge' and 'gt' matches 
> > > > > are not really useful at the moment...  
> > > > 
> > > > ...yes, I can't think of any other use for those either.    
> > > 
> > > Those could really be useful if the counters could be decremented. 
> > > Otherwise I think the counter matching in the sets is not as useful as 
> > > it seems to be.  
> > 
> > Still, if counters are updated with just matching element, but not 
> > necessarily matching rule, they should be as useful as in the hypothesis 
> > of introducing a "decrementing" feature -- one just needs to adjust the 
> > rule logic to that.  
> 
> That's true.
> 
> > > > > The other possibility is to force counter update. I.e. instead of
> > > > > 
> > > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > > 
> > > > > something like
> > > > > 
> > > > > iptables -I INPUT -m set --match-set c src --update-counters \
> > > > > 	--bytes-gt 800 -j DROP
> > > > > 
> > > > > but that also requires some internal changes to store a new flag, because 
> > > > > at the moment only "! --update-counters" is supported. So there'd be then 
> > > > > a fine-grained control over how the counters are updated:
> > > > > 
> > > > > - no --update-counters flag: update counters only if the whole rule 
> > > > >   matches, including the counter matches
> > > > > - --update-counters flag: update counters if counter matching is false    
> > > > 
> > > > ...this should probably be "in any case", also if it's true.    
> > > 
> > > Yes, but now I don't really like the name itself: --force-update-counters
> > > or something like that would be more clear.
> > >   
> > > > > - ! --update-counters flag: don't update counters    
> > > > 
> > > > I think that would fix the issue as well, I'm just struggling to find a
> > > > sensible use case for the "no --update-counters" case -- especially one
> > > > where there would be a substantial issue with the change I proposed.    
> > > 
> > > The no update counter flag was introduced to handle when one needs to 
> > > match in the same set multiple times, i.e. there are multiple rules with 
> > > the same set. Like you need to match in the raw/mangle/filter tables as 
> > > well. Unfortunately I can't recall the usercase.  
> > 
> > Okay, but what you're describing is the "! --update-counters" option. 
> > That works, didn't work before 4750005a85f7, but would still work with 
> > this change.
> > 
> > What I meant is really the case where "--update-counters" (or 
> > "--force-update-counters") and "! --update-counters" are both absent: I 
> > don't see any particular advantage in the current behaviour for that 
> > case.  
> 
> The counters are used just for statistical purposes: reflect the 
> packets/bytes which were let through, i.e. matched the whole "rule".
> In that case updating the counters before the counter value matching is 
> evaluated gives false results.

Well, but for that, iptables/x_tables counters are available and
(as far as I know) typically used.
Stefano Brivio Feb. 28, 2020, 12:28 p.m. UTC | #9
On Fri, 28 Feb 2020 12:40:39 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:

> Hi Jozsef,
> 
> On Thu, 27 Feb 2020 21:37:10 +0100 (CET)
> Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> 
> > Hi Stefano,
> > 
> > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> >   
> > > On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
> > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > >     
> > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > >     
> > > > > > The logic could be changed in the user rules from
> > > > > > 
> > > > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > > > 
> > > > > > to
> > > > > > 
> > > > > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > > > > [ otherwise DROP ]
> > > > > > 
> > > > > > but of course it might be not so simple, depending on how the rules are 
> > > > > > built up.      
> > > > > 
> > > > > Yes, it would work, unless the user actually wants to check with the
> > > > > same counter how many bytes are sent "in excess".      
> > > > 
> > > > You mean the counters are still updated whenever the element is matched in 
> > > > the set and then one could check how many bytes were sent over the 
> > > > threshold just by listing the set elements.    
> > > 
> > > Yes, exactly -- note that it was possible (and, I think, used) before.    
> > 
> > I'm still not really convinced about such a feature. Why is it useful to 
> > know how many bytes would be sent over the "limit"?  
> 
> This is useful in case one wants different treatments for packets
> according to a number of thresholds in different rules. For example,
> 
>     iptables -I INPUT -m set --match-set c src --bytes-lt 100 -j noise
>     iptables -I noise -m set --match-set c src --bytes-lt 20000 -j download
                                                         ^^ gt, of
                                                         course :)
Jozsef Kadlecsik March 3, 2020, 9:36 a.m. UTC | #10
Hi Stefano,

On Fri, 28 Feb 2020, Stefano Brivio wrote:

> On Thu, 27 Feb 2020 21:37:10 +0100 (CET)
> Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> 
> > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > 
> > > On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
> > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > >   
> > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > >   
> > > > > > The logic could be changed in the user rules from
> > > > > > 
> > > > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > > > 
> > > > > > to
> > > > > > 
> > > > > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > > > > [ otherwise DROP ]
> > > > > > 
> > > > > > but of course it might be not so simple, depending on how the rules are 
> > > > > > built up.    
> > > > > 
> > > > > Yes, it would work, unless the user actually wants to check with the
> > > > > same counter how many bytes are sent "in excess".    
> > > > 
> > > > You mean the counters are still updated whenever the element is matched in 
> > > > the set and then one could check how many bytes were sent over the 
> > > > threshold just by listing the set elements.  
> > > 
> > > Yes, exactly -- note that it was possible (and, I think, used) before.  
> > 
> > I'm still not really convinced about such a feature. Why is it useful to 
> > know how many bytes would be sent over the "limit"?
> 
> This is useful in case one wants different treatments for packets
> according to a number of thresholds in different rules. For example,
> 
>     iptables -I INPUT -m set --match-set c src --bytes-lt 100 -j noise
>     iptables -I noise -m set --match-set c src --bytes-lt 20000 -j download
> 
> and you want to log packets from chains 'noise' and 'download' with
> different prefixes.

What do you think about this patch?

diff --git a/kernel/include/uapi/linux/netfilter/ipset/ip_set.h b/kernel/include/uapi/linux/netfilter/ipset/ip_set.h
index 7545af4..6881329 100644
--- a/kernel/include/uapi/linux/netfilter/ipset/ip_set.h
+++ b/kernel/include/uapi/linux/netfilter/ipset/ip_set.h
@@ -186,6 +186,9 @@ enum ipset_cmd_flags {
 	IPSET_FLAG_MAP_SKBPRIO = (1 << IPSET_FLAG_BIT_MAP_SKBPRIO),
 	IPSET_FLAG_BIT_MAP_SKBQUEUE = 10,
 	IPSET_FLAG_MAP_SKBQUEUE = (1 << IPSET_FLAG_BIT_MAP_SKBQUEUE),
+	IPSET_FLAG_BIT_UPDATE_COUNTERS_FIRST = 11,
+	IPSET_FLAG_UPDATE_COUNTERS_FIRST =
+		(1 << IPSET_FLAG_BIT_UPDATE_COUNTERS_FIRST),
 	IPSET_FLAG_CMD_MAX = 15,
 };
 
diff --git a/kernel/net/netfilter/ipset/ip_set_core.c b/kernel/net/netfilter/ipset/ip_set_core.c
index 1df6536..423d0de 100644
--- a/kernel/net/netfilter/ipset/ip_set_core.c
+++ b/kernel/net/netfilter/ipset/ip_set_core.c
@@ -622,10 +622,9 @@ ip_set_add_packets(u64 packets, struct ip_set_counter *counter)
 
 static void
 ip_set_update_counter(struct ip_set_counter *counter,
-		      const struct ip_set_ext *ext, u32 flags)
+		      const struct ip_set_ext *ext)
 {
-	if (ext->packets != ULLONG_MAX &&
-	    !(flags & IPSET_FLAG_SKIP_COUNTER_UPDATE)) {
+	if (ext->packets != ULLONG_MAX) {
 		ip_set_add_bytes(ext->bytes, counter);
 		ip_set_add_packets(ext->packets, counter);
 	}
@@ -649,13 +648,19 @@ ip_set_match_extensions(struct ip_set *set, const struct ip_set_ext *ext,
 	if (SET_WITH_COUNTER(set)) {
 		struct ip_set_counter *counter = ext_counter(data, set);
 
+		if (flags & IPSET_FLAG_UPDATE_COUNTERS_FIRST)
+			ip_set_update_counter(counter, ext);
+
 		if (flags & IPSET_FLAG_MATCH_COUNTERS &&
 		    !(ip_set_match_counter(ip_set_get_packets(counter),
 				mext->packets, mext->packets_op) &&
 		      ip_set_match_counter(ip_set_get_bytes(counter),
 				mext->bytes, mext->bytes_op)))
 			return false;
-		ip_set_update_counter(counter, ext, flags);
+
+		if (!(flags & (IPSET_FLAG_UPDATE_COUNTERS_FIRST|
+			       IPSET_FLAG_SKIP_COUNTER_UPDATE)))
+			ip_set_update_counter(counter, ext);
 	}
 	if (SET_WITH_SKBINFO(set))
 		ip_set_get_skbinfo(ext_skbinfo(data, set),

Then the rules above would look like

... -m set ... --update-counters-first --bytes-lt 100 -j noise
... -m set ... --update-counters-first --bytes-ge 100 -j download
 
> > > What I meant is really the case where "--update-counters" (or 
> > > "--force-update-counters") and "! --update-counters" are both 
> > > absent: I don't see any particular advantage in the current 
> > > behaviour for that case.
> > 
> > The counters are used just for statistical purposes: reflect the 
> > packets/bytes which were let through, i.e. matched the whole "rule". 
> > In that case updating the counters before the counter value matching 
> > is evaluated gives false results.
> 
> Well, but for that, iptables/x_tables counters are available and (as far 
> as I know) typically used.

With "rules" I meant at ipset level (match element + packet/byte counters 
as specified), i.e. counters for statistical purposes per set elements 
level.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary
Stefano Brivio March 3, 2020, 10:16 p.m. UTC | #11
Hi József,

On Tue, 3 Mar 2020 10:36:53 +0100 (CET)
Jozsef Kadlecsik <kadlec@netfilter.org> wrote:

> Hi Stefano,
> 
> On Fri, 28 Feb 2020, Stefano Brivio wrote:
> 
> > On Thu, 27 Feb 2020 21:37:10 +0100 (CET)
> > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> >   
> > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > >   
> > > > On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
> > > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > >     
> > > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > > >     
> > > > > > > The logic could be changed in the user rules from
> > > > > > > 
> > > > > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > > > > 
> > > > > > > to
> > > > > > > 
> > > > > > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > > > > > [ otherwise DROP ]
> > > > > > > 
> > > > > > > but of course it might be not so simple, depending on how the rules are 
> > > > > > > built up.      
> > > > > > 
> > > > > > Yes, it would work, unless the user actually wants to check with the
> > > > > > same counter how many bytes are sent "in excess".      
> > > > > 
> > > > > You mean the counters are still updated whenever the element is matched in 
> > > > > the set and then one could check how many bytes were sent over the 
> > > > > threshold just by listing the set elements.    
> > > > 
> > > > Yes, exactly -- note that it was possible (and, I think, used) before.    
> > > 
> > > I'm still not really convinced about such a feature. Why is it useful to 
> > > know how many bytes would be sent over the "limit"?  
> > 
> > This is useful in case one wants different treatments for packets
> > according to a number of thresholds in different rules. For example,
> > 
> >     iptables -I INPUT -m set --match-set c src --bytes-lt 100 -j noise
> >     iptables -I noise -m set --match-set c src --bytes-lt 20000 -j download
> > 
> > and you want to log packets from chains 'noise' and 'download' with
> > different prefixes.  
> 
> What do you think about this patch?

Thanks, I think it gives a way to avoid the issue.

I'm still not convinced that keeping this disabled by default is the
best way to go (mostly because we had a kernel change affecting
semantics that were exported to userspace for a long time), but if
there's a need for the opposite of this option, introducing it as a
negation becomes linguistically awkward. :)

And anyway, it's surely better than not having this possibility at all.

Let me know if you want me to review (or try to draft) man page
changes. Just a few comments inline:

> diff --git a/kernel/include/uapi/linux/netfilter/ipset/ip_set.h b/kernel/include/uapi/linux/netfilter/ipset/ip_set.h
> index 7545af4..6881329 100644
> --- a/kernel/include/uapi/linux/netfilter/ipset/ip_set.h
> +++ b/kernel/include/uapi/linux/netfilter/ipset/ip_set.h
> @@ -186,6 +186,9 @@ enum ipset_cmd_flags {
>  	IPSET_FLAG_MAP_SKBPRIO = (1 << IPSET_FLAG_BIT_MAP_SKBPRIO),
>  	IPSET_FLAG_BIT_MAP_SKBQUEUE = 10,
>  	IPSET_FLAG_MAP_SKBQUEUE = (1 << IPSET_FLAG_BIT_MAP_SKBQUEUE),
> +	IPSET_FLAG_BIT_UPDATE_COUNTERS_FIRST = 11,
> +	IPSET_FLAG_UPDATE_COUNTERS_FIRST =
> +		(1 << IPSET_FLAG_BIT_UPDATE_COUNTERS_FIRST),
>  	IPSET_FLAG_CMD_MAX = 15,
>  };
>  
> diff --git a/kernel/net/netfilter/ipset/ip_set_core.c b/kernel/net/netfilter/ipset/ip_set_core.c
> index 1df6536..423d0de 100644
> --- a/kernel/net/netfilter/ipset/ip_set_core.c
> +++ b/kernel/net/netfilter/ipset/ip_set_core.c
> @@ -622,10 +622,9 @@ ip_set_add_packets(u64 packets, struct ip_set_counter *counter)
>  
>  static void
>  ip_set_update_counter(struct ip_set_counter *counter,
> -		      const struct ip_set_ext *ext, u32 flags)
> +		      const struct ip_set_ext *ext)
>  {
> -	if (ext->packets != ULLONG_MAX &&
> -	    !(flags & IPSET_FLAG_SKIP_COUNTER_UPDATE)) {
> +	if (ext->packets != ULLONG_MAX) {

This means that UPDATE_COUNTERS_FIRST wins over SKIP_COUNTER_UPDATE. Is
that intended? Intuitively, I would say that "skip" is more imperative
than "do it *first*". Anyway, I guess they will be mutually exclusive.

>  		ip_set_add_bytes(ext->bytes, counter);
>  		ip_set_add_packets(ext->packets, counter);
>  	}
> @@ -649,13 +648,19 @@ ip_set_match_extensions(struct ip_set *set, const struct ip_set_ext *ext,
>  	if (SET_WITH_COUNTER(set)) {
>  		struct ip_set_counter *counter = ext_counter(data, set);
>  
> +		if (flags & IPSET_FLAG_UPDATE_COUNTERS_FIRST)
> +			ip_set_update_counter(counter, ext);
> +
>  		if (flags & IPSET_FLAG_MATCH_COUNTERS &&
>  		    !(ip_set_match_counter(ip_set_get_packets(counter),
>  				mext->packets, mext->packets_op) &&
>  		      ip_set_match_counter(ip_set_get_bytes(counter),
>  				mext->bytes, mext->bytes_op)))
>  			return false;
> -		ip_set_update_counter(counter, ext, flags);
> +
> +		if (!(flags & (IPSET_FLAG_UPDATE_COUNTERS_FIRST|

Nit: whitespace before |.

> +			       IPSET_FLAG_SKIP_COUNTER_UPDATE)))
> +			ip_set_update_counter(counter, ext);
>  	}
>  	if (SET_WITH_SKBINFO(set))
>  		ip_set_get_skbinfo(ext_skbinfo(data, set),
> 
> Then the rules above would look like
> 
> ... -m set ... --update-counters-first --bytes-lt 100 -j noise
> ... -m set ... --update-counters-first --bytes-ge 100 -j download

Sorry for the typo in my previous example, I really meant:

  iptables -I INPUT -m set --match-set c src --bytes-gt 100 -j noise
  iptables -I noise -m set --match-set c src --bytes-gt 20000 -j download

that is, "noise" is more than "a single connection attempt", and
"download" is even more. But I think your example is equivalent for
this purpose.

> > > > What I meant is really the case where "--update-counters" (or 
> > > > "--force-update-counters") and "! --update-counters" are both 
> > > > absent: I don't see any particular advantage in the current 
> > > > behaviour for that case.  
> > > 
> > > The counters are used just for statistical purposes: reflect the 
> > > packets/bytes which were let through, i.e. matched the whole "rule". 
> > > In that case updating the counters before the counter value matching 
> > > is evaluated gives false results.  
> > 
> > Well, but for that, iptables/x_tables counters are available and (as far 
> > as I know) typically used.  
> 
> With "rules" I meant at ipset level (match element + packet/byte counters 
> as specified), i.e. counters for statistical purposes per set elements 
> level.

Ah, I see now, thanks for explaining.
Jozsef Kadlecsik March 9, 2020, 10:07 a.m. UTC | #12
Hi Stefano,

On Tue, 3 Mar 2020, Stefano Brivio wrote:

> On Tue, 3 Mar 2020 10:36:53 +0100 (CET)
> Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> 
> > On Fri, 28 Feb 2020, Stefano Brivio wrote:
> > 
> > > On Thu, 27 Feb 2020 21:37:10 +0100 (CET)
> > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > >   
> > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > >   
> > > > > On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
> > > > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > > >     
> > > > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > > > >     
> > > > > > > > The logic could be changed in the user rules from
> > > > > > > > 
> > > > > > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > > > > > 
> > > > > > > > to
> > > > > > > > 
> > > > > > > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > > > > > > [ otherwise DROP ]
> > > > > > > > 
> > > > > > > > but of course it might be not so simple, depending on how the rules are 
> > > > > > > > built up.      
> > > > > > > 
> > > > > > > Yes, it would work, unless the user actually wants to check with the
> > > > > > > same counter how many bytes are sent "in excess".      
> > > > > > 
> > > > > > You mean the counters are still updated whenever the element is matched in 
> > > > > > the set and then one could check how many bytes were sent over the 
> > > > > > threshold just by listing the set elements.    
> > > > > 
> > > > > Yes, exactly -- note that it was possible (and, I think, used) before.    
> > > > 
> > > > I'm still not really convinced about such a feature. Why is it useful to 
> > > > know how many bytes would be sent over the "limit"?  
> > > 
> > > This is useful in case one wants different treatments for packets
> > > according to a number of thresholds in different rules. For example,
> > > 
> > >     iptables -I INPUT -m set --match-set c src --bytes-lt 100 -j noise
> > >     iptables -I noise -m set --match-set c src --bytes-lt 20000 -j download
> > > 
> > > and you want to log packets from chains 'noise' and 'download' with
> > > different prefixes.  
> > 
> > What do you think about this patch?
> 
> Thanks, I think it gives a way to avoid the issue.
> 
> I'm still not convinced that keeping this disabled by default is the 
> best way to go (mostly because we had a kernel change affecting 
> semantics that were exported to userspace for a long time), but if 
> there's a need for the opposite of this option, introducing it as a 
> negation becomes linguistically awkward. :)

The situation is far from ideal: the original mode (update counters 
regardless of the outcome of the counter matches) worked for almost five 
years. Then the 'Fix "don't update counters" mode...' patch changed it so 
that the result of the counter matches was taken into account, for about 
two years. I don't know how many user is expecting either the original or 
the changed behaviour, but better not change it again. Also, the grammar 
seems to be simpler this way :-).
 
> And anyway, it's surely better than not having this possibility at all.
> 
> Let me know if you want me to review (or try to draft) man page
> changes. Just a few comments inline:

Could you write a manpage update to describe properly the features?
 
> > diff --git a/kernel/net/netfilter/ipset/ip_set_core.c b/kernel/net/netfilter/ipset/ip_set_core.c
> > index 1df6536..423d0de 100644
> > --- a/kernel/net/netfilter/ipset/ip_set_core.c
> > +++ b/kernel/net/netfilter/ipset/ip_set_core.c
> > @@ -622,10 +622,9 @@ ip_set_add_packets(u64 packets, struct ip_set_counter *counter)
> >  
> >  static void
> >  ip_set_update_counter(struct ip_set_counter *counter,
> > -		      const struct ip_set_ext *ext, u32 flags)
> > +		      const struct ip_set_ext *ext)
> >  {
> > -	if (ext->packets != ULLONG_MAX &&
> > -	    !(flags & IPSET_FLAG_SKIP_COUNTER_UPDATE)) {
> > +	if (ext->packets != ULLONG_MAX) {
> 
> This means that UPDATE_COUNTERS_FIRST wins over SKIP_COUNTER_UPDATE. Is 
> that intended? Intuitively, I would say that "skip" is more imperative 
> than "do it *first*". Anyway, I guess they will be mutually exclusive.

In my opinion the two flags are mutually exclusive, therefore I dropped
the checking in ip_set_update_counter(). IPSET_FLAG_SKIP_COUNTER_UPDATE is 
taken into account in ip_set_match_extensions() now.

> > -		ip_set_update_counter(counter, ext, flags);
> > +
> > +		if (!(flags & (IPSET_FLAG_UPDATE_COUNTERS_FIRST|
> 
> Nit: whitespace before |.

I'll fix it, thanks!

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary
Phil Sutter April 8, 2020, 4:09 p.m. UTC | #13
Hi!

On Mon, Mar 09, 2020 at 11:07:46AM +0100, Jozsef Kadlecsik wrote:
> On Tue, 3 Mar 2020, Stefano Brivio wrote:
> > On Tue, 3 Mar 2020 10:36:53 +0100 (CET)
> > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > On Fri, 28 Feb 2020, Stefano Brivio wrote:
> > > > On Thu, 27 Feb 2020 21:37:10 +0100 (CET)
> > > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > > > > On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
> > > > > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > > > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > > > > > > > The logic could be changed in the user rules from
> > > > > > > > > 
> > > > > > > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > > > > > > 
> > > > > > > > > to
> > > > > > > > > 
> > > > > > > > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > > > > > > > [ otherwise DROP ]
> > > > > > > > > 
> > > > > > > > > but of course it might be not so simple, depending on how the rules are 
> > > > > > > > > built up.      
> > > > > > > > 
> > > > > > > > Yes, it would work, unless the user actually wants to check with the
> > > > > > > > same counter how many bytes are sent "in excess".      
> > > > > > > 
> > > > > > > You mean the counters are still updated whenever the element is matched in 
> > > > > > > the set and then one could check how many bytes were sent over the 
> > > > > > > threshold just by listing the set elements.    
> > > > > > 
> > > > > > Yes, exactly -- note that it was possible (and, I think, used) before.    
> > > > > 
> > > > > I'm still not really convinced about such a feature. Why is it useful to 
> > > > > know how many bytes would be sent over the "limit"?  
> > > > 
> > > > This is useful in case one wants different treatments for packets
> > > > according to a number of thresholds in different rules. For example,
> > > > 
> > > >     iptables -I INPUT -m set --match-set c src --bytes-lt 100 -j noise
> > > >     iptables -I noise -m set --match-set c src --bytes-lt 20000 -j download
> > > > 
> > > > and you want to log packets from chains 'noise' and 'download' with
> > > > different prefixes.  
> > > 
> > > What do you think about this patch?
> > 
> > Thanks, I think it gives a way to avoid the issue.
> > 
> > I'm still not convinced that keeping this disabled by default is the 
> > best way to go (mostly because we had a kernel change affecting 
> > semantics that were exported to userspace for a long time), but if 
> > there's a need for the opposite of this option, introducing it as a 
> > negation becomes linguistically awkward. :)
> 
> The situation is far from ideal: the original mode (update counters 
> regardless of the outcome of the counter matches) worked for almost five 
> years. Then the 'Fix "don't update counters" mode...' patch changed it so 
> that the result of the counter matches was taken into account, for about 
> two years. I don't know how many user is expecting either the original or 
> the changed behaviour, but better not change it again. Also, the grammar 
> seems to be simpler this way :-).

I didn't look at the mentioned fix, but if it really changed counters
that fundamentally, that's a clear sign that nobody uses it, or at least
nobody with a current kernel. :)

Either way, the risk of reverting to the old behaviour is not bigger
than the original divert two years ago and that seems to not have upset
anyone.

Regarding the actual discussed functionality, I second Stefano in that
ipset match and rule match should be regarded as two different things:
ipset counters should count how many packets matched an element in that
ipset, not how many packets matched an iptables rule referring to it.
For the latter question, there are iptables rule counters already.

Cheers, Phil
Jozsef Kadlecsik April 8, 2020, 7:59 p.m. UTC | #14
Hi Phil,

On Wed, 8 Apr 2020, Phil Sutter wrote:

> On Mon, Mar 09, 2020 at 11:07:46AM +0100, Jozsef Kadlecsik wrote:
> > On Tue, 3 Mar 2020, Stefano Brivio wrote:
> > > On Tue, 3 Mar 2020 10:36:53 +0100 (CET)
> > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > > On Fri, 28 Feb 2020, Stefano Brivio wrote:
> > > > > On Thu, 27 Feb 2020 21:37:10 +0100 (CET)
> > > > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > > > > > On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
> > > > > > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > > > > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > > > > > > > > The logic could be changed in the user rules from
> > > > > > > > > > 
> > > > > > > > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > > > > > > > 
> > > > > > > > > > to
> > > > > > > > > > 
> > > > > > > > > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > > > > > > > > [ otherwise DROP ]
> > > > > > > > > > 
> > > > > > > > > > but of course it might be not so simple, depending on how the rules are 
> > > > > > > > > > built up.      
> > > > > > > > > 
> > > > > > > > > Yes, it would work, unless the user actually wants to check with the
> > > > > > > > > same counter how many bytes are sent "in excess".      
> > > > > > > > 
> > > > > > > > You mean the counters are still updated whenever the element is matched in 
> > > > > > > > the set and then one could check how many bytes were sent over the 
> > > > > > > > threshold just by listing the set elements.    
> > > > > > > 
> > > > > > > Yes, exactly -- note that it was possible (and, I think, used) before.    
> > > > > > 
> > > > > > I'm still not really convinced about such a feature. Why is it useful to 
> > > > > > know how many bytes would be sent over the "limit"?  
> > > > > 
> > > > > This is useful in case one wants different treatments for packets
> > > > > according to a number of thresholds in different rules. For example,
> > > > > 
> > > > >     iptables -I INPUT -m set --match-set c src --bytes-lt 100 -j noise
> > > > >     iptables -I noise -m set --match-set c src --bytes-lt 20000 -j download
> > > > > 
> > > > > and you want to log packets from chains 'noise' and 'download' with
> > > > > different prefixes.  
> > > > 
> > > > What do you think about this patch?
> > > 
> > > Thanks, I think it gives a way to avoid the issue.
> > > 
> > > I'm still not convinced that keeping this disabled by default is the 
> > > best way to go (mostly because we had a kernel change affecting 
> > > semantics that were exported to userspace for a long time), but if 
> > > there's a need for the opposite of this option, introducing it as a 
> > > negation becomes linguistically awkward. :)
> > 
> > The situation is far from ideal: the original mode (update counters 
> > regardless of the outcome of the counter matches) worked for almost five 
> > years. Then the 'Fix "don't update counters" mode...' patch changed it so 
> > that the result of the counter matches was taken into account, for about 
> > two years. I don't know how many user is expecting either the original or 
> > the changed behaviour, but better not change it again. Also, the grammar 
> > seems to be simpler this way :-).
> 
> I didn't look at the mentioned fix, but if it really changed counters 
> that fundamentally, that's a clear sign that nobody uses it, or at least 
> nobody with a current kernel. :)
> 
> Either way, the risk of reverting to the old behaviour is not bigger 
> than the original divert two years ago and that seems to not have upset 
> anyone.
> 
> Regarding the actual discussed functionality, I second Stefano in that 
> ipset match and rule match should be regarded as two different things: 
> ipset counters should count how many packets matched an element in that 
> ipset, not how many packets matched an iptables rule referring to it. 
> For the latter question, there are iptables rule counters already.

The ipset match cannot check the other parts of the iptables rule.

The question is that in the case of the ipset match, say

	-m set --match-set foo src --packets-lt N

should the ipset elemet counters be updated if the packet matched the set
or if the packet matched the set and the element counter values according 
to the condition as well? What constitutes a "match" in the ipset context 
and how does it refer to the counter updating?

Stefano believes that the former is the natural choice. In my mind the 
second one.

The patch in the ipset git tree makes possible to choose :-)

What is the case with nftables? Is the counter value updated if the 
counter match parts in a rule returns false?

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary
Stefano Brivio April 8, 2020, 8:20 p.m. UTC | #15
On Wed, 8 Apr 2020 21:59:11 +0200 (CEST)
Jozsef Kadlecsik <kadlec@netfilter.org> wrote:

> The patch in the ipset git tree makes possible to choose :-)

József, by the way, let me know what you want to do with the
iptables-extensions man patches I sent. I'm assuming you'd take care
of merging them or re-posting them "at the right time". :)
Jozsef Kadlecsik April 8, 2020, 9:40 p.m. UTC | #16
Hi Stefano,

On Wed, 8 Apr 2020, Stefano Brivio wrote:

> On Wed, 8 Apr 2020 21:59:11 +0200 (CEST)
> Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> 
> > The patch in the ipset git tree makes possible to choose :-)
> 
> József, by the way, let me know what you want to do with the 
> iptables-extensions man patches I sent. I'm assuming you'd take care of 
> merging them or re-posting them "at the right time". :)

It's not forgotten, I'm waiting for a patch which also extends the set 
match and I'd better avoid two new revisions. This is the reason I haven't 
sent yet the patch in the ipset git tree to Pablo.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary
Phil Sutter April 9, 2020, 9:16 a.m. UTC | #17
Hi Jozsef,

On Wed, Apr 08, 2020 at 09:59:11PM +0200, Jozsef Kadlecsik wrote:
> On Wed, 8 Apr 2020, Phil Sutter wrote:
> > On Mon, Mar 09, 2020 at 11:07:46AM +0100, Jozsef Kadlecsik wrote:
> > > On Tue, 3 Mar 2020, Stefano Brivio wrote:
> > > > On Tue, 3 Mar 2020 10:36:53 +0100 (CET)
> > > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > > > On Fri, 28 Feb 2020, Stefano Brivio wrote:
> > > > > > On Thu, 27 Feb 2020 21:37:10 +0100 (CET)
> > > > > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > > > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > > > > > > On Tue, 25 Feb 2020 21:37:45 +0100 (CET)
> > > > > > > > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > > > > > > > On Tue, 25 Feb 2020, Stefano Brivio wrote:
> > > > > > > > > > > The logic could be changed in the user rules from
> > > > > > > > > > > 
> > > > > > > > > > > iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
> > > > > > > > > > > 
> > > > > > > > > > > to
> > > > > > > > > > > 
> > > > > > > > > > > iptables -I INPUT -m set --match-set c src --bytes-lt 800 -j ACCEPT
> > > > > > > > > > > [ otherwise DROP ]
> > > > > > > > > > > 
> > > > > > > > > > > but of course it might be not so simple, depending on how the rules are 
> > > > > > > > > > > built up.      
> > > > > > > > > > 
> > > > > > > > > > Yes, it would work, unless the user actually wants to check with the
> > > > > > > > > > same counter how many bytes are sent "in excess".      
> > > > > > > > > 
> > > > > > > > > You mean the counters are still updated whenever the element is matched in 
> > > > > > > > > the set and then one could check how many bytes were sent over the 
> > > > > > > > > threshold just by listing the set elements.    
> > > > > > > > 
> > > > > > > > Yes, exactly -- note that it was possible (and, I think, used) before.    
> > > > > > > 
> > > > > > > I'm still not really convinced about such a feature. Why is it useful to 
> > > > > > > know how many bytes would be sent over the "limit"?  
> > > > > > 
> > > > > > This is useful in case one wants different treatments for packets
> > > > > > according to a number of thresholds in different rules. For example,
> > > > > > 
> > > > > >     iptables -I INPUT -m set --match-set c src --bytes-lt 100 -j noise
> > > > > >     iptables -I noise -m set --match-set c src --bytes-lt 20000 -j download
> > > > > > 
> > > > > > and you want to log packets from chains 'noise' and 'download' with
> > > > > > different prefixes.  
> > > > > 
> > > > > What do you think about this patch?
> > > > 
> > > > Thanks, I think it gives a way to avoid the issue.
> > > > 
> > > > I'm still not convinced that keeping this disabled by default is the 
> > > > best way to go (mostly because we had a kernel change affecting 
> > > > semantics that were exported to userspace for a long time), but if 
> > > > there's a need for the opposite of this option, introducing it as a 
> > > > negation becomes linguistically awkward. :)
> > > 
> > > The situation is far from ideal: the original mode (update counters 
> > > regardless of the outcome of the counter matches) worked for almost five 
> > > years. Then the 'Fix "don't update counters" mode...' patch changed it so 
> > > that the result of the counter matches was taken into account, for about 
> > > two years. I don't know how many user is expecting either the original or 
> > > the changed behaviour, but better not change it again. Also, the grammar 
> > > seems to be simpler this way :-).
> > 
> > I didn't look at the mentioned fix, but if it really changed counters 
> > that fundamentally, that's a clear sign that nobody uses it, or at least 
> > nobody with a current kernel. :)
> > 
> > Either way, the risk of reverting to the old behaviour is not bigger 
> > than the original divert two years ago and that seems to not have upset 
> > anyone.
> > 
> > Regarding the actual discussed functionality, I second Stefano in that 
> > ipset match and rule match should be regarded as two different things: 
> > ipset counters should count how many packets matched an element in that 
> > ipset, not how many packets matched an iptables rule referring to it. 
> > For the latter question, there are iptables rule counters already.
> 
> The ipset match cannot check the other parts of the iptables rule.
> 
> The question is that in the case of the ipset match, say
> 
> 	-m set --match-set foo src --packets-lt N
> 
> should the ipset elemet counters be updated if the packet matched the set
> or if the packet matched the set and the element counter values according 
> to the condition as well? What constitutes a "match" in the ipset context 
> and how does it refer to the counter updating?

From my perspective, the ipset holding elements and associated counters
is a separate entity from the set match in an iptables rule as in your
above example.

I didn't check the code, my intuitive understanding of how that set
match works is by element lookup which returns the element (if found)
with associated counters and the counter comparison happens afterwards.

With the above two assumptions in mind, the question how/when counters
should be updated is quite clear: Since there is no explicit "update
counter" action, the ipset is supposed to update counters whenever it
returns an element upon lookup.

> Stefano believes that the former is the natural choice. In my mind the 
> second one.
> 
> The patch in the ipset git tree makes possible to choose :-)

iptables' recent match is a good example for a fully explicit interface,
but IMHO it is far from intuitive, either. :(

> What is the case with nftables? Is the counter value updated if the 
> counter match parts in a rule returns false?

For quota or limit statements, that's the case. Otherwise they wouldn't
work:

| tcp dport 80 quota over 25MB drop

If the statement wouldn't count if 'over 25MB' condition wasn't true,
the condition could never become true.

Cheers, Phil
diff mbox series

Patch

diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 69c107f9ba8d..b140e38d9333 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -649,13 +649,14 @@  ip_set_match_extensions(struct ip_set *set, const struct ip_set_ext *ext,
 	if (SET_WITH_COUNTER(set)) {
 		struct ip_set_counter *counter = ext_counter(data, set);
 
+		ip_set_update_counter(counter, ext, flags);
+
 		if (flags & IPSET_FLAG_MATCH_COUNTERS &&
 		    !(ip_set_match_counter(ip_set_get_packets(counter),
 				mext->packets, mext->packets_op) &&
 		      ip_set_match_counter(ip_set_get_bytes(counter),
 				mext->bytes, mext->bytes_op)))
 			return false;
-		ip_set_update_counter(counter, ext, flags);
 	}
 	if (SET_WITH_SKBINFO(set))
 		ip_set_get_skbinfo(ext_skbinfo(data, set),