Patchwork [RFC] Inter-match communication cache

Submitter Jozsef Kadlecsik
Date Sept. 18, 2012, 9:01 p.m.
Message ID <alpine.DEB.2.00.1209182232460.5113@blackhole.kfki.hu>
Permalink /patch/184855/
State Not Applicable

Comments

Jozsef Kadlecsik - Sept. 18, 2012, 9:01 p.m.
Hi,

I propose a small cache for inter-match communication (the diff is in the
Patch section at the end).

The cache makes it possible to pass data between matches in a rule or in
different rules in the same table. Currently there's no easy way to
communicate between matches.

Long story:

The hash:*net* set types of ipset support storing "negated" (nomatch)
entries in a set, which makes it possible to build up exceptions.  For
example, if we want to match all IP addresses from 192.168.0.0/16 except
192.168.0.0/24 and 192.168.16.0/24 as source addresses, then we could use
the set

ipset new foo hash:net
ipset add foo 192.168.0.0/16
ipset add foo 192.168.0.0/24 nomatch
ipset add foo 192.168.16.0/24 nomatch

and the rule

iptables ... -m set --match-set foo src -j ...

However, we actually face a three-valued decision when matching an IP
address against such sets:

- Can the IP address be found in the set as a plain element without a mark?
- Can the IP address be found in the set, but marked with "nomatch"?
- Or can the IP address not be found in the set at all?

We could get the three different values using two evaluations, which
requires the new flag of the set match coming with the next ipset release:

# 1. Match if the IP address is in the set marked with "nomatch" flag
iptables ... -m set --match-set foo src --return-nomatch -j ...
# 2. Match if the IP address is in the set without the "nomatch" flag
iptables ... -m set --match-set foo src -j ...
# 3. Fall through, no match in the set
...

However, that means two full set evaluations, when we actually already
know the result after the first match: we are just not capable of
branching on or reusing the result. With the proposed patch the set match
could store the result at 1. in the cache (MATCH flagged with NOMATCH,
MATCH, NONE) and the second match at 2. above could reuse it, skipping the
full evaluation of the set.
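
To illustrate the reuse described above, here is a minimal user-space
sketch, not the proposed kernel code. All names (set_result, pkt_cache,
full_lookup, set_match) are hypothetical, and the toy lookup hard-codes the
example set "foo" from earlier in this mail. It assumes both rules test the
same address of the same packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The three outcomes of a set lookup, plus "nothing cached yet". */
enum set_result {
	SET_RESULT_UNKNOWN = 0,	/* no lookup has been cached */
	SET_RESULT_NONE,	/* address is not in the set at all */
	SET_RESULT_MATCH,	/* plain element */
	SET_RESULT_NOMATCH,	/* element flagged "nomatch" */
};

/* Per-packet scratch space, standing in for the proposed cache field. */
struct pkt_cache {
	uint16_t set_id;
	enum set_result result;
};

/* Counts full evaluations so that reuse can be observed. */
static unsigned int full_lookups;

/* Toy stand-in for a real set lookup: 192.168.0.0/16 with two /24 exceptions. */
static enum set_result full_lookup(uint32_t addr)
{
	full_lookups++;
	if ((addr & 0xFFFF0000u) != 0xC0A80000u)	/* 192.168.0.0/16 */
		return SET_RESULT_NONE;
	if ((addr & 0xFFFFFF00u) == 0xC0A80000u ||	/* 192.168.0.0/24 */
	    (addr & 0xFFFFFF00u) == 0xC0A81000u)	/* 192.168.16.0/24 */
		return SET_RESULT_NOMATCH;
	return SET_RESULT_MATCH;
}

/*
 * Hypothetical set match: evaluates the set only if the cache holds no
 * result for this set, then answers from the cached three-valued result.
 */
static bool set_match(struct pkt_cache *c, uint16_t set_id, uint32_t addr,
		      bool return_nomatch)
{
	assert(c);
	if (c->result == SET_RESULT_UNKNOWN || c->set_id != set_id) {
		c->set_id = set_id;
		c->result = full_lookup(addr);
	}
	return c->result == (return_nomatch ? SET_RESULT_NOMATCH
					    : SET_RESULT_MATCH);
}
```

With a zero-initialized struct pkt_cache per packet, calling set_match()
first with return_nomatch set (rule 1) and then without it (rule 2) runs
full_lookup() only once.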

I pondered a lot on the possible solutions and the cache seemed to be the
least intrusive and least complex one. Please review, all comments are
very welcome.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesper Dangaard Brouer - Sept. 19, 2012, 3:38 p.m.
On Tue, 18 Sep 2012, Jozsef Kadlecsik wrote:

> I propose a small cache for inter-match communication purpose:
>
> diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
> index 8d674a7..f07eab2 100644
> --- a/include/linux/netfilter/x_tables.h
> +++ b/include/linux/netfilter/x_tables.h
> @@ -216,6 +216,9 @@ struct xt_action_param {
> 		const void *matchinfo, *targinfo;
> 	};
> 	const struct net_device *in, *out;
> +#ifdef CONFIG_NETFILTER_XTABLES_CACHE
> +	u_int32_t cache;
> +#endif

Perhaps we should add it at the end of the struct, to avoid too much ABI
breakage.  And I generally don't like adding compile-time optional
elements in the middle of a struct, as it makes it harder to cache-profile
and to pad/align the struct.

> 	int fragoff;
> 	unsigned int thoff;
> 	unsigned int hooknum;
> @@ -223,6 +226,15 @@ struct xt_action_param {
> 	bool hotdrop;
> };

>
> +enum xt_cache_owner {
> +	XT_CACHE_OWNER_NONE	= 0,
> +	XT_CACHE_OWNER_IPSET	= 1,
> +};
> +
> +#define XT_CACHE_GET_OWNER(cache)	 (((cache) & 0xFF000000) >> 24)
> +#define XT_CACHE_SET_OWNER(cache, owner) ((cache) |= (owner) << 24)
> +#define XT_CACHE_GET_VALUE(cache)	 ((cache) & 0x00FFFFFF)
> +

So, you are reserving 24 bits for data/"values", and we have 8 bits for
setting an owner of this data.  That's the basic idea, right?


Cheers,
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------
Jan Engelhardt - Sept. 19, 2012, 4:11 p.m.
On Wednesday 2012-09-19 17:38, Jesper Dangaard Brouer wrote:
>> +++ b/include/linux/netfilter/x_tables.h
>> @@ -216,6 +216,9 @@ struct xt_action_param {
>> 		const void *matchinfo, *targinfo;
>> 	};
>> 	const struct net_device *in, *out;
>> +#ifdef CONFIG_NETFILTER_XTABLES_CACHE
>> +	u_int32_t cache;
>> +#endif
>
> Perhaps we should add it at the end of the struct, to avoid too much ABI
> breakage.

It makes no difference where it is added. The ABI changes either way.
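
A small user-space sketch of that point. The structs below are simplified
stand-ins for xt_action_param, not the real definition: the leading unions
are omitted and the family member is an assumption. On common ILP32 and
LP64 targets the struct grows in both variants; inserting in the middle
additionally shifts the offsets of all later members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the struct before the patch. */
struct param_base {
	const void *in, *out;
	int fragoff;
	unsigned int thoff;
	unsigned int hooknum;
	uint8_t family;		/* assumed member, not shown in the diff */
	_Bool hotdrop;
};

/* Variant 1: cache inserted after *out, as in the patch. */
struct param_cache_mid {
	const void *in, *out;
	uint32_t cache;
	int fragoff;
	unsigned int thoff;
	unsigned int hooknum;
	uint8_t family;
	_Bool hotdrop;
};

/* Variant 2: cache appended at the end of the struct. */
struct param_cache_end {
	const void *in, *out;
	int fragoff;
	unsigned int thoff;
	unsigned int hooknum;
	uint8_t family;
	_Bool hotdrop;
	uint32_t cache;
};

/* Checks the layout properties discussed above; aborts if one fails. */
static void check_abi_changes(void)
{
	/* Inserting in the middle shifts every later member. */
	assert(offsetof(struct param_cache_mid, fragoff) >
	       offsetof(struct param_base, fragoff));
	/* Appending keeps the old offsets ... */
	assert(offsetof(struct param_cache_end, hotdrop) ==
	       offsetof(struct param_base, hotdrop));
	/* ... but the struct still grows, so the ABI changes as well. */
	assert(sizeof(struct param_cache_end) > sizeof(struct param_base));
}
```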
Jozsef Kadlecsik - Sept. 19, 2012, 5:25 p.m.
On Wed, 19 Sep 2012, Jesper Dangaard Brouer wrote:

> On Tue, 18 Sep 2012, Jozsef Kadlecsik wrote:
> 
> > I propose a small cache for inter-match communication purpose:
> > 
> > diff --git a/include/linux/netfilter/x_tables.h
> > b/include/linux/netfilter/x_tables.h
> > index 8d674a7..f07eab2 100644
> > --- a/include/linux/netfilter/x_tables.h
> > +++ b/include/linux/netfilter/x_tables.h
> > @@ -216,6 +216,9 @@ struct xt_action_param {
> > 		const void *matchinfo, *targinfo;
> > 	};
> > 	const struct net_device *in, *out;
> > +#ifdef CONFIG_NETFILTER_XTABLES_CACHE
> > +	u_int32_t cache;
> > +#endif
> 
> Perhaps we should add it at the end of the struct, to avoid too much ABI
> breakage.  And I generally don't like adding compile-time optional elements
> in the middle of a struct, as it makes it harder to cache-profile and to
> pad/align the struct.

I did not want to create a hole; that's why I put the new structure
element after *out.

The optional compiling in is just a suggestion :-)
 
> > 	int fragoff;
> > 	unsigned int thoff;
> > 	unsigned int hooknum;
> > @@ -223,6 +226,15 @@ struct xt_action_param {
> > 	bool hotdrop;
> > };
> 
> > 
> > +enum xt_cache_owner {
> > +	XT_CACHE_OWNER_NONE	= 0,
> > +	XT_CACHE_OWNER_IPSET	= 1,
> > +};
> > +
> > +#define XT_CACHE_GET_OWNER(cache)	 (((cache) & 0xFF000000) >> 24)
> > +#define XT_CACHE_SET_OWNER(cache, owner) ((cache) |= (owner) << 24)
> > +#define XT_CACHE_GET_VALUE(cache)	 ((cache) & 0x00FFFFFF)
> > +
> 
> So, you are reserving 24 bits for data/"values", and we have 8 bits for
> setting an owner of this data.  That's the basic idea, right?

Yes, but we could reserve fewer bits, say 6, 5 or even 4, for the owner,
giving more space to the value.

Of course the cache is too small to hold a pointer, but a match can store
internal state flags there. For the "set" match that means the set
identifier (16 bits) and the match flags together with the match result
(8 bits).
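
A possible packing of these fields, as a user-space sketch under stated
assumptions: the owner sits in the top 8 bits as in the patch, while the
placement of the set identifier in bits 8-23 and of the flags/result byte
in bits 0-7 is only an assumption for illustration. The helper names are
hypothetical. Unlike an OR into an existing word, cache_pack() builds the
whole word at once, so no stale owner bits can survive.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_OWNER_SHIFT	24
#define CACHE_VALUE_MASK	0x00FFFFFFu

enum cache_owner {
	CACHE_OWNER_NONE	= 0,
	CACHE_OWNER_IPSET	= 1,
};

/* Build a cache word: 8 bits owner, 16 bits set id, 8 bits flags/result. */
static uint32_t cache_pack(enum cache_owner owner, uint16_t set_id,
			   uint8_t flags)
{
	uint32_t value = ((uint32_t)set_id << 8) | flags;

	assert((unsigned int)owner <= 0xFFu);		/* owner fits in 8 bits */
	assert((value & ~CACHE_VALUE_MASK) == 0);	/* value fits in 24 bits */
	return ((uint32_t)owner << CACHE_OWNER_SHIFT) | value;
}

/* Extract the owner from the top 8 bits. */
static enum cache_owner cache_get_owner(uint32_t cache)
{
	return (enum cache_owner)(cache >> CACHE_OWNER_SHIFT);
}

/* Extract the 16-bit set identifier from bits 8-23. */
static uint16_t cache_get_set_id(uint32_t cache)
{
	return (uint16_t)((cache & CACHE_VALUE_MASK) >> 8);
}

/* Extract the flags/result byte from bits 0-7. */
static uint8_t cache_get_flags(uint32_t cache)
{
	return (uint8_t)(cache & 0xFFu);
}
```

A match that finds a different owner, or a different set identifier, in
the cache would ignore the value and fall back to a full evaluation.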

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary
Pablo Neira - Sept. 24, 2012, 9:57 a.m.
Hi Jozsef,

On Tue, Sep 18, 2012 at 11:01:34PM +0200, Jozsef Kadlecsik wrote:
> Hi,
> 
> I propose a small cache for inter-match communication purpose:
> 
> diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
> index 8d674a7..f07eab2 100644
> --- a/include/linux/netfilter/x_tables.h
> +++ b/include/linux/netfilter/x_tables.h
> @@ -216,6 +216,9 @@ struct xt_action_param {
>  		const void *matchinfo, *targinfo;
>  	};
>  	const struct net_device *in, *out;
> +#ifdef CONFIG_NETFILTER_XTABLES_CACHE
> +	u_int32_t cache;
> +#endif

I think you can implement this by means of one per-CPU cache inside
the xt_set match.

Check the old per-CPU event cache in net/netfilter/nf_conntrack_ecache.c.
We used to have something similar.

I'd prefer this approach to a change in xtables for this.
It still seems to me too specific to your xt_set extension.

It will remain internal to the xt_set match, but we can revisit this later
on to generalize it if it becomes interesting for more matches.
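
A rough user-space sketch of the idea, with a C11 thread-local variable
standing in for a per-CPU variable (in the kernel this would presumably use
DEFINE_PER_CPU and this_cpu_ptr). The struct and function names are
hypothetical, and keying the slot on the packet pointer is an assumption:
real code would have to make sure that a stale entry is never hit when a
later packet reuses the same address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One cached lookup result; one slot per CPU (here: per thread). */
struct set_cache_slot {
	const void *pkt;	/* packet the entry belongs to */
	uint16_t set_id;	/* set that was evaluated */
	uint8_t result;		/* match flags and result */
	bool valid;
};

static _Thread_local struct set_cache_slot set_cache;

/* Remember the result of a full set evaluation for this packet. */
static void set_cache_store(const void *pkt, uint16_t set_id, uint8_t result)
{
	set_cache.pkt = pkt;
	set_cache.set_id = set_id;
	set_cache.result = result;
	set_cache.valid = true;
}

/* Return true and fill *result only if the slot is for this packet and set. */
static bool set_cache_lookup(const void *pkt, uint16_t set_id, uint8_t *result)
{
	assert(result != NULL);
	if (!set_cache.valid || set_cache.pkt != pkt ||
	    set_cache.set_id != set_id)
		return false;
	*result = set_cache.result;
	return true;
}
```

Nothing in xtables itself would need to change with this approach; the
slot stays private to the match.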
Jozsef Kadlecsik - Sept. 24, 2012, 10:25 a.m.
On Mon, 24 Sep 2012, Pablo Neira Ayuso wrote:

> On Tue, Sep 18, 2012 at 11:01:34PM +0200, Jozsef Kadlecsik wrote:
> > 
> > I propose a small cache for inter-match communication purpose:
> > 
> > diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
> > index 8d674a7..f07eab2 100644
> > --- a/include/linux/netfilter/x_tables.h
> > +++ b/include/linux/netfilter/x_tables.h
> > @@ -216,6 +216,9 @@ struct xt_action_param {
> >  		const void *matchinfo, *targinfo;
> >  	};
> >  	const struct net_device *in, *out;
> > +#ifdef CONFIG_NETFILTER_XTABLES_CACHE
> > +	u_int32_t cache;
> > +#endif
> 
> I think you can implement this by means of one per-CPU cache inside
> the xt_set match.
> 
> Check the old per-CPU event cache in net/netfilter/nf_conntrack_ecache.c.
> We used to have something similar.

That's a good idea! I'd prefer something encapsulated inside xt_set as
well; I'm going to look into this possibility.
 
Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary

Patch

diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index 8d674a7..f07eab2 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -216,6 +216,9 @@  struct xt_action_param {
 		const void *matchinfo, *targinfo;
 	};
 	const struct net_device *in, *out;
+#ifdef CONFIG_NETFILTER_XTABLES_CACHE
+	u_int32_t cache;
+#endif
 	int fragoff;
 	unsigned int thoff;
 	unsigned int hooknum;
@@ -223,6 +226,15 @@  struct xt_action_param {
 	bool hotdrop;
 };
 
+enum xt_cache_owner {
+	XT_CACHE_OWNER_NONE	= 0,
+	XT_CACHE_OWNER_IPSET	= 1,
+};
+
+#define XT_CACHE_GET_OWNER(cache)	 (((cache) & 0xFF000000) >> 24)
+#define XT_CACHE_SET_OWNER(cache, owner) ((cache) |= (owner) << 24)
+#define XT_CACHE_GET_VALUE(cache)	 ((cache) & 0x00FFFFFF)
+
 /**
  * struct xt_mtchk_param - parameters for match extensions'
  * checkentry functions