[v3] netfilter : add NAT support for shifted portmap ranges

Message ID 7503dae4-e597-c3ca-1671-a24809be4ed0@dtsystems.be
State Under Review
Delegated to: Pablo Neira
Headers show
Series
  • [v3] netfilter : add NAT support for shifted portmap ranges
Related show

Commit Message

Thierry Du Tre Jan. 12, 2018, 2:01 p.m.
This is a patch proposal to support shifted ranges in portmaps.
(i.e. tcp/udp incoming port 5000-5100 on WAN redirected to LAN 192.168.1.5:2000-2100)

Currently DNAT only works for single port or identical port ranges.
(i.e. ports 5000-5100 on WAN interface redirected to a LAN host while original destination port is not altered)
When different port ranges are configured, either 'random' mode should be used, or else all incoming connections are mapped onto the first port in the redirect range. (in described example WAN:5000-5100 will all be mapped to 192.168.1.5:2000)

This patch introduces a new mode indicated by flag NF_NAT_RANGE_PROTO_OFFSET which uses a base port value to calculate an offset with the destination port present in the incoming stream. That offset is then applied as index within the redirect port range (index modulo rangewidth to handle range overflow).

In described example the base port would be 5000. An incoming stream with destination port 5004 would result in an offset value 4 which means that the NAT'ed stream will be using destination port 2004.

Other possibilities include deterministic mapping of larger or multiple ranges to a smaller range : WAN:5000-5999 -> LAN:5000-5099 (maps WAN port 5*xx to port 51xx)

This patch does not change any current behavior. It just adds new NAT proto range functionality which must be selected via the specific flag when intended to use.

A patch for iptables (libipt_DNAT.c) will also be proposed which makes this functionality immediately available.

Signed-off-by: Thierry Du Tre <thierry@dtsystems.be>

---
Changes in v3:
    - use nf_nat_range as name for updated struct, renamed existing nf_nat_range to nf_nat_range1
    - reverted all nf_nat_range2 occurences

Changes in v2:
    - added new revision for SNAT and DNAT targets to support the new base port variable in struct nf_nat_range2
    - replaced all occurences of struct nf_nat_range with struct nf_nat_range2

 include/uapi/linux/netfilter/nf_nat.h | 12 +++++++-
 net/netfilter/nf_nat_core.c           |  9 +++---
 net/netfilter/nf_nat_proto_common.c   |  5 ++-
 net/netfilter/xt_nat.c                | 58 ++++++++++++++++++++++++++++++++++-
 4 files changed, 77 insertions(+), 7 deletions(-)

Comments

Thierry Du Tre Jan. 15, 2018, 12:56 p.m. | #1
Hi Pablo,

I prepared this third version to get aligned about the way forward for the extension for struct nf_nat_range.

Renaming the old definition as you suggested indeed results in a much smaller patch for netfilter kernel part.
However, doing it like this also means that userspace code will require changes to cope with the new value for sizeof(struct nf_nat_range).

i.e. iptables-1.6.1 :

./extensions/libip6t_SNAT.c:306:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
./extensions/libip6t_DNAT.c:290:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
./extensions/libip6t_NETMAP.c:89:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
./extensions/libip6t_MASQUERADE.c:159:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
./extensions/libip6t_REDIRECT.c:158:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),

As far as I understand, all these xt target modules will have to increment their revision which makes them incompatible with current kernel versions.
The other option is to replace all occurences of nf_nat_range with nf_nat_range1 in these userspace libraries.
That would solve iptables but possible other applications might also be impacted ?

Somehow this doesn't seem right to me, so I might have misinterpreted your earlier response.


On 12-01-18 15:01, Thierry Du Tre wrote:
> This is a patch proposal to support shifted ranges in portmaps.
> (i.e. tcp/udp incoming port 5000-5100 on WAN redirected to LAN 192.168.1.5:2000-2100)
>
> Currently DNAT only works for single port or identical port ranges.
> (i.e. ports 5000-5100 on WAN interface redirected to a LAN host while original destination port is not altered)
> When different port ranges are configured, either 'random' mode should be used, or else all incoming connections are mapped onto the first port in the redirect range. (in described example WAN:5000-5100 will all be mapped to 192.168.1.5:2000)
>
> This patch introduces a new mode indicated by flag NF_NAT_RANGE_PROTO_OFFSET which uses a base port value to calculate an offset with the destination port present in the incoming stream. That offset is then applied as index within the redirect port range (index modulo rangewidth to handle range overflow).
>
> In described example the base port would be 5000. An incoming stream with destination port 5004 would result in an offset value 4 which means that the NAT'ed stream will be using destination port 2004.
>
> Other possibilities include deterministic mapping of larger or multiple ranges to a smaller range : WAN:5000-5999 -> LAN:5000-5099 (maps WAN port 5*xx to port 51xx)
>
> This patch does not change any current behavior. It just adds new NAT proto range functionality which must be selected via the specific flag when intended to use.
>
> A patch for iptables (libipt_DNAT.c) will also be proposed which makes this functionality immediately available.
>
> Signed-off-by: Thierry Du Tre <thierry@dtsystems.be>
>
> ---
> Changes in v3:
>     - use nf_nat_range as name for updated struct, renamed existing nf_nat_range to nf_nat_range1
>     - reverted all nf_nat_range2 occurences
>
> Changes in v2:
>     - added new revision for SNAT and DNAT targets to support the new base port variable in struct nf_nat_range2
>     - replaced all occurences of struct nf_nat_range with struct nf_nat_range2
>
>  include/uapi/linux/netfilter/nf_nat.h | 12 +++++++-
>  net/netfilter/nf_nat_core.c           |  9 +++---
>  net/netfilter/nf_nat_proto_common.c   |  5 ++-
>  net/netfilter/xt_nat.c                | 58 ++++++++++++++++++++++++++++++++++-
>  4 files changed, 77 insertions(+), 7 deletions(-)
>
> diff --git a/include/uapi/linux/netfilter/nf_nat.h b/include/uapi/linux/netfilter/nf_nat.h
> index a33000d..0fa792a 100644
> --- a/include/uapi/linux/netfilter/nf_nat.h
> +++ b/include/uapi/linux/netfilter/nf_nat.h
> @@ -10,6 +10,7 @@
>  #define NF_NAT_RANGE_PROTO_RANDOM		(1 << 2)
>  #define NF_NAT_RANGE_PERSISTENT			(1 << 3)
>  #define NF_NAT_RANGE_PROTO_RANDOM_FULLY		(1 << 4)
> +#define NF_NAT_RANGE_PROTO_OFFSET		(1 << 5)
>  
>  #define NF_NAT_RANGE_PROTO_RANDOM_ALL		\
>  	(NF_NAT_RANGE_PROTO_RANDOM | NF_NAT_RANGE_PROTO_RANDOM_FULLY)
> @@ -17,7 +18,7 @@
>  #define NF_NAT_RANGE_MASK					\
>  	(NF_NAT_RANGE_MAP_IPS | NF_NAT_RANGE_PROTO_SPECIFIED |	\
>  	 NF_NAT_RANGE_PROTO_RANDOM | NF_NAT_RANGE_PERSISTENT |	\
> -	 NF_NAT_RANGE_PROTO_RANDOM_FULLY)
> +	 NF_NAT_RANGE_PROTO_RANDOM_FULLY | NF_NAT_RANGE_PROTO_OFFSET)
>  
>  struct nf_nat_ipv4_range {
>  	unsigned int			flags;
> @@ -32,12 +33,21 @@ struct nf_nat_ipv4_multi_range_compat {
>  	struct nf_nat_ipv4_range	range[1];
>  };
>  
> +struct nf_nat_range1 {
> +	unsigned int			flags;
> +	union nf_inet_addr		min_addr;
> +	union nf_inet_addr		max_addr;
> +	union nf_conntrack_man_proto	min_proto;
> +	union nf_conntrack_man_proto	max_proto;
> +};
> +
>  struct nf_nat_range {
>  	unsigned int			flags;
>  	union nf_inet_addr		min_addr;
>  	union nf_inet_addr		max_addr;
>  	union nf_conntrack_man_proto	min_proto;
>  	union nf_conntrack_man_proto	max_proto;
> +	union nf_conntrack_man_proto	base_proto;
>  };
>  
>  #endif /* _NETFILTER_NF_NAT_H */
> diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
> index af8345f..de5c327 100644
> --- a/net/netfilter/nf_nat_core.c
> +++ b/net/netfilter/nf_nat_core.c
> @@ -347,9 +347,10 @@ get_unique_tuple(struct nf_conntrack_tuple *tuple,
>  	/* Only bother mapping if it's not already in range and unique */
>  	if (!(range->flags & NF_NAT_RANGE_PROTO_RANDOM_ALL)) {
>  		if (range->flags & NF_NAT_RANGE_PROTO_SPECIFIED) {
> -			if (l4proto->in_range(tuple, maniptype,
> -					      &range->min_proto,
> -					      &range->max_proto) &&
> +			if (!(range->flags & NF_NAT_RANGE_PROTO_OFFSET) &&
> +			    l4proto->in_range(tuple, maniptype,
> +			          &range->min_proto,
> +			          &range->max_proto) &&
>  			    (range->min_proto.all == range->max_proto.all ||
>  			     !nf_nat_used_tuple(tuple, ct)))
>  				goto out;
> @@ -358,7 +359,7 @@ get_unique_tuple(struct nf_conntrack_tuple *tuple,
>  		}
>  	}
>  
> -	/* Last change: get protocol to try to obtain unique tuple. */
> +	/* Last chance: get protocol to try to obtain unique tuple. */
>  	l4proto->unique_tuple(l3proto, tuple, range, maniptype, ct);
>  out:
>  	rcu_read_unlock();
> diff --git a/net/netfilter/nf_nat_proto_common.c b/net/netfilter/nf_nat_proto_common.c
> index fbce552..2c30eca 100644
> --- a/net/netfilter/nf_nat_proto_common.c
> +++ b/net/netfilter/nf_nat_proto_common.c
> @@ -80,6 +80,8 @@ void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
>  						  : tuple->src.u.all);
>  	} else if (range->flags & NF_NAT_RANGE_PROTO_RANDOM_FULLY) {
>  		off = prandom_u32();
> +	} else if (range->flags & NF_NAT_RANGE_PROTO_OFFSET) {
> +		off = (ntohs(*portptr) - ntohs(range->base_proto.all));
>  	} else {
>  		off = *rover;
>  	}
> @@ -88,7 +90,8 @@ void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
>  		*portptr = htons(min + off % range_size);
>  		if (++i != range_size && nf_nat_used_tuple(tuple, ct))
>  			continue;
> -		if (!(range->flags & NF_NAT_RANGE_PROTO_RANDOM_ALL))
> +		if (!(range->flags & (NF_NAT_RANGE_PROTO_RANDOM_ALL|
> +					NF_NAT_RANGE_PROTO_OFFSET)))
>  			*rover = off;
>  		return;
>  	}
> diff --git a/net/netfilter/xt_nat.c b/net/netfilter/xt_nat.c
> index 0fd14d1..7e8552b 100644
> --- a/net/netfilter/xt_nat.c
> +++ b/net/netfilter/xt_nat.c
> @@ -41,6 +41,7 @@ static void xt_nat_convert_range(struct nf_nat_range *dst,
>  {
>  	memset(&dst->min_addr, 0, sizeof(dst->min_addr));
>  	memset(&dst->max_addr, 0, sizeof(dst->max_addr));
> +	memset(&dst->base_proto, 0, sizeof(dst->base_proto));
>  
>  	dst->flags	 = src->flags;
>  	dst->min_addr.ip = src->min_ip;
> @@ -85,6 +86,39 @@ xt_dnat_target_v0(struct sk_buff *skb, const struct xt_action_param *par)
>  static unsigned int
>  xt_snat_target_v1(struct sk_buff *skb, const struct xt_action_param *par)
>  {
> +	const struct nf_nat_range1 *range_v1 = par->targinfo;
> +	struct nf_nat_range range = {};
> +	enum ip_conntrack_info ctinfo;
> +	struct nf_conn *ct;
> +
> +	ct = nf_ct_get(skb, &ctinfo);
> +	WARN_ON(!(ct != NULL &&
> +		 (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED ||
> +		  ctinfo == IP_CT_RELATED_REPLY)));
> +
> +	memcpy(&range, range_v1, sizeof(*range_v1));
> +	return nf_nat_setup_info(ct, &range, NF_NAT_MANIP_SRC);
> +}
> +
> +static unsigned int
> +xt_dnat_target_v1(struct sk_buff *skb, const struct xt_action_param *par)
> +{
> +	const struct nf_nat_range1 *range_v1 = par->targinfo;
> +	struct nf_nat_range range = {};
> +	enum ip_conntrack_info ctinfo;
> +	struct nf_conn *ct;
> +
> +	ct = nf_ct_get(skb, &ctinfo);
> +	WARN_ON(!(ct != NULL &&
> +		 (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED)));
> +
> +	memcpy(&range, range_v1, sizeof(*range_v1));
> +	return nf_nat_setup_info(ct, &range, NF_NAT_MANIP_DST);
> +}
> +
> +static unsigned int
> +xt_snat_target_v2(struct sk_buff *skb, const struct xt_action_param *par)
> +{
>  	const struct nf_nat_range *range = par->targinfo;
>  	enum ip_conntrack_info ctinfo;
>  	struct nf_conn *ct;
> @@ -98,7 +132,7 @@ xt_snat_target_v1(struct sk_buff *skb, const struct xt_action_param *par)
>  }
>  
>  static unsigned int
> -xt_dnat_target_v1(struct sk_buff *skb, const struct xt_action_param *par)
> +xt_dnat_target_v2(struct sk_buff *skb, const struct xt_action_param *par)
>  {
>  	const struct nf_nat_range *range = par->targinfo;
>  	enum ip_conntrack_info ctinfo;
> @@ -162,6 +196,28 @@ static struct xt_target xt_nat_target_reg[] __read_mostly = {
>  				  (1 << NF_INET_LOCAL_OUT),
>  		.me		= THIS_MODULE,
>  	},
> +	{
> +		.name		= "SNAT",
> +		.revision	= 2,
> +		.checkentry	= xt_nat_checkentry,
> +		.destroy	= xt_nat_destroy,
> +		.target		= xt_snat_target_v2,
> +		.targetsize	= sizeof(struct nf_nat_range),
> +		.table		= "nat",
> +		.hooks		= (1 << NF_INET_POST_ROUTING) |
> +				  (1 << NF_INET_LOCAL_IN),
> +		.me		= THIS_MODULE,
> +	},
> +	{
> +		.name		= "DNAT",
> +		.revision	= 2,
> +		.target		= xt_dnat_target_v2,
> +		.targetsize	= sizeof(struct nf_nat_range),
> +		.table		= "nat",
> +		.hooks		= (1 << NF_INET_PRE_ROUTING) |
> +				  (1 << NF_INET_LOCAL_OUT),
> +		.me		= THIS_MODULE,
> +	},
>  };
>  
>  static int __init xt_nat_init(void)

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Jan. 16, 2018, 2:32 p.m. | #2
Hi Thierry,

On Mon, Jan 15, 2018 at 01:56:09PM +0100, Thierry Du Tre wrote:
> Hi Pablo,
> 
> I prepared this third version to get aligned about the way forward for the extension for struct nf_nat_range.
> 
> Renaming the old definition as you suggested indeed results in a much smaller patch for netfilter kernel part.
> However, doing it like this also means that userspace code will require changes to cope with the new value for sizeof(struct nf_nat_range).
> 
> i.e. iptables-1.6.1 :
> 
> ./extensions/libip6t_SNAT.c:306:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> ./extensions/libip6t_DNAT.c:290:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> ./extensions/libip6t_NETMAP.c:89:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> ./extensions/libip6t_MASQUERADE.c:159:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> ./extensions/libip6t_REDIRECT.c:158:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> 
> As far as I understand, all these xt target modules will have to increment their revision which makes them incompatible with current kernel versions.
> The other option is to replace all occurences of nf_nat_range with nf_nat_range1 in these userspace libraries.
> That would solve iptables but possible other applications might also be impacted ?
> 
> Somehow this doesn't seem right to me, so I might have misinterpreted your earlier response.

I guess you need to add new revisions for the userspace code too,
right? Am I missing anything?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Du Tre Jan. 16, 2018, 9:26 p.m. | #3
Op 16/01/2018 om 15:32 schreef Pablo Neira Ayuso:
> Hi Thierry,
> 
> On Mon, Jan 15, 2018 at 01:56:09PM +0100, Thierry Du Tre wrote:
>> Hi Pablo,
>>
>> I prepared this third version to get aligned about the way forward for the extension for struct nf_nat_range.
>>
>> Renaming the old definition as you suggested indeed results in a much smaller patch for netfilter kernel part.
>> However, doing it like this also means that userspace code will require changes to cope with the new value for sizeof(struct nf_nat_range).
>>
>> i.e. iptables-1.6.1 :
>>
>> ./extensions/libip6t_SNAT.c:306:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>> ./extensions/libip6t_DNAT.c:290:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>> ./extensions/libip6t_NETMAP.c:89:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>> ./extensions/libip6t_MASQUERADE.c:159:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>> ./extensions/libip6t_REDIRECT.c:158:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>>
>> As far as I understand, all these xt target modules will have to increment their revision which makes them incompatible with current kernel versions.
>> The other option is to replace all occurences of nf_nat_range with nf_nat_range1 in these userspace libraries.
>> That would solve iptables but possible other applications might also be impacted ?
>>
>> Somehow this doesn't seem right to me, so I might have misinterpreted your earlier response.
> 
> I guess you need to add new revisions for the userspace code too,
> right? Am I missing anything?
 
I would sure add a new revision for SNAT and DNAT. In theory I could do that for the others too, although that won't enable any new functionality there.

The problem is about the renaming of the current nf_nat_range to nf_nat_range_old, and using nf_nat_range to name the new extended struct :

+struct nf_nat_range_old {
+       unsigned int                    flags;
+       union nf_inet_addr              min_addr;
+       union nf_inet_addr              max_addr;
+       union nf_conntrack_man_proto    min_proto;
+       union nf_conntrack_man_proto    max_proto;
+};
+
 struct nf_nat_range {
        unsigned int                    flags;
        union nf_inet_addr              min_addr;
        union nf_inet_addr              max_addr;
        union nf_conntrack_man_proto    min_proto;
        union nf_conntrack_man_proto    max_proto;
+       union nf_conntrack_man_proto    base_proto;
 };

In order to keep iptables extensions backwards compatible with current and previous kernel versions -which don't have the new revision-, we need to replace all nf_nat_range occurences in those with nf_nat_range_old.
Otherwise, when compiling with the new extended definition for nf_nat_range, the xt_target structures with current revision will use different sizes for the datablob sent to kernel.
i.e. libip6t_REDIRECT.c :
static struct xtables_target redirect_tg_reg = {
	.name		= "REDIRECT",
	.version	= XTABLES_VERSION,
	.family		= NFPROTO_IPV6,
	.size		= XT_ALIGN(sizeof(struct nf_nat_range)),
	.userspacesize	= XT_ALIGN(sizeof(struct nf_nat_range)),
...
}
Values for .size and .userspacesize will increase which makes the blob sent to kernel larger than netfilter REDIRECT (xt_target revision=0) currently expects.

I hope this explains my concern correctly. My apologies if not..
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Du Tre Jan. 17, 2018, 6:31 p.m. | #4
Op 16/01/2018 om 15:32 schreef Pablo Neira Ayuso:
> Hi Thierry,
> 
> On Mon, Jan 15, 2018 at 01:56:09PM +0100, Thierry Du Tre wrote:
>> Hi Pablo,
>>
>> I prepared this third version to get aligned about the way forward for the extension for struct nf_nat_range.
>>
>> Renaming the old definition as you suggested indeed results in a much smaller patch for netfilter kernel part.
>> However, doing it like this also means that userspace code will require changes to cope with the new value for sizeof(struct nf_nat_range).
>>
>> i.e. iptables-1.6.1 :
>>
>> ./extensions/libip6t_SNAT.c:306:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>> ./extensions/libip6t_DNAT.c:290:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>> ./extensions/libip6t_NETMAP.c:89:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>> ./extensions/libip6t_MASQUERADE.c:159:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>> ./extensions/libip6t_REDIRECT.c:158:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
>>
>> As far as I understand, all these xt target modules will have to increment their revision which makes them incompatible with current kernel versions.
>> The other option is to replace all occurences of nf_nat_range with nf_nat_range1 in these userspace libraries.
>> That would solve iptables but possible other applications might also be impacted ?
>>
>> Somehow this doesn't seem right to me, so I might have misinterpreted your earlier response.
> 
> I guess you need to add new revisions for the userspace code too,
> right? Am I missing anything?

I attached an example change for libipt6t_SNAT in iptables. This illustrates how renaming current nf_nat_range should be handled in userspace code.
If we would just increment the revision number in snat_tg_reg, then a new kernel with support for revision 2 would be required for creating SNAT rules.
So for compatibility with existing kernels, we need to keep current SNAT revision in but adapt to the renamed nf_nat_range datastruct.

---
 extensions/libip6t_SNAT.c        | 20 ++++++++++----------
 include/linux/netfilter/nf_nat.h | 19 +++++++++++++++++--
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/extensions/libip6t_SNAT.c b/extensions/libip6t_SNAT.c
index 7d74b3d..3eb0319 100644
--- a/extensions/libip6t_SNAT.c
+++ b/extensions/libip6t_SNAT.c
@@ -47,7 +47,7 @@ static const struct xt_option_entry SNAT_opts[] = {
 
 /* Ranges expected in network order. */
 static void
-parse_to(const char *orig_arg, int portok, struct nf_nat_range *range)
+parse_to(const char *orig_arg, int portok, nf_nat_range1 *range)
 {
 	char *arg, *start, *end = NULL, *colon = NULL, *dash, *error;
 	const struct in6_addr *ip;
@@ -150,7 +150,7 @@ parse_to(const char *orig_arg, int portok, struct nf_nat_range *range)
 static void SNAT_parse(struct xt_option_call *cb)
 {
 	const struct ip6t_entry *entry = cb->xt_entry;
-	struct nf_nat_range *range = cb->data;
+	nf_nat_range1 *range = cb->data;
 	int portok;
 
 	if (entry->ipv6.proto == IPPROTO_TCP ||
@@ -182,7 +182,7 @@ static void SNAT_fcheck(struct xt_fcheck_call *cb)
 {
 	static const unsigned int f = F_TO_SRC | F_RANDOM;
 	static const unsigned int r = F_TO_SRC | F_RANDOM_FULLY;
-	struct nf_nat_range *range = cb->data;
+	nf_nat_range1 *range = cb->data;
 
 	if ((cb->xflags & f) == f)
 		range->flags |= NF_NAT_RANGE_PROTO_RANDOM;
@@ -190,7 +190,7 @@ static void SNAT_fcheck(struct xt_fcheck_call *cb)
 		range->flags |= NF_NAT_RANGE_PROTO_RANDOM_FULLY;
 }
 
-static void print_range(const struct nf_nat_range *range)
+static void print_range(const nf_nat_range1 *range)
 {
 	if (range->flags & NF_NAT_RANGE_MAP_IPS) {
 		if (range->flags & NF_NAT_RANGE_PROTO_SPECIFIED)
@@ -213,7 +213,7 @@ static void print_range(const struct nf_nat_range *range)
 static void SNAT_print(const void *ip, const struct xt_entry_target *target,
                        int numeric)
 {
-	const struct nf_nat_range *range = (const void *)target->data;
+	const nf_nat_range1 *range = (const void *)target->data;
 
 	printf(" to:");
 	print_range(range);
@@ -227,7 +227,7 @@ static void SNAT_print(const void *ip, const struct xt_entry_target *target,
 
 static void SNAT_save(const void *ip, const struct xt_entry_target *target)
 {
-	const struct nf_nat_range *range = (const void *)target->data;
+	const nf_nat_range1 *range = (const void *)target->data;
 
 	printf(" --to-source ");
 	print_range(range);
@@ -239,7 +239,7 @@ static void SNAT_save(const void *ip, const struct xt_entry_target *target)
 		printf(" --persistent");
 }
 
-static void print_range_xlate(const struct nf_nat_range *range,
+static void print_range_xlate(const nf_nat_range1 *range,
 			      struct xt_xlate *xl)
 {
 	bool proto_specified = range->flags & NF_NAT_RANGE_PROTO_SPECIFIED;
@@ -270,7 +270,7 @@ static void print_range_xlate(const struct nf_nat_range *range,
 static int SNAT_xlate(struct xt_xlate *xl,
 		      const struct xt_xlate_tg_params *params)
 {
-	const struct nf_nat_range *range = (const void *)params->target->data;
+	const nf_nat_range1 *range = (const void *)params->target->data;
 	bool sep_need = false;
 	const char *sep = " ";
 
@@ -300,8 +300,8 @@ static struct xtables_target snat_tg_reg = {
 	.version	= XTABLES_VERSION,
 	.family		= NFPROTO_IPV6,
 	.revision	= 1,
-	.size		= XT_ALIGN(sizeof(struct nf_nat_range)),
-	.userspacesize	= XT_ALIGN(sizeof(struct nf_nat_range)),
+	.size		= XT_ALIGN(sizeof(nf_nat_range1)),
+	.userspacesize	= XT_ALIGN(sizeof(nf_nat_range1)),
 	.help		= SNAT_help,
 	.x6_parse	= SNAT_parse,
 	.x6_fcheck	= SNAT_fcheck,
diff --git a/include/linux/netfilter/nf_nat.h b/include/linux/netfilter/nf_nat.h
index 1ad3659..3d1c1e7 100644
--- a/include/linux/netfilter/nf_nat.h
+++ b/include/linux/netfilter/nf_nat.h
@@ -9,10 +9,16 @@
 #define NF_NAT_RANGE_PROTO_RANDOM		(1 << 2)
 #define NF_NAT_RANGE_PERSISTENT			(1 << 3)
 #define NF_NAT_RANGE_PROTO_RANDOM_FULLY		(1 << 4)
+#define NF_NAT_RANGE_PROTO_OFFSET		(1 << 5)
 
 #define NF_NAT_RANGE_PROTO_RANDOM_ALL		\
 	(NF_NAT_RANGE_PROTO_RANDOM | NF_NAT_RANGE_PROTO_RANDOM_FULLY)
 
+#define NF_NAT_RANGE_MASK					\
+	(NF_NAT_RANGE_MAP_IPS | NF_NAT_RANGE_PROTO_SPECIFIED |	\
+	 NF_NAT_RANGE_PROTO_RANDOM | NF_NAT_RANGE_PERSISTENT |	\
+	 NF_NAT_RANGE_PROTO_RANDOM_FULLY | NF_NAT_RANGE_PROTO_OFFSET)
+
 struct nf_nat_ipv4_range {
 	unsigned int			flags;
 	__be32				min_ip;
@@ -26,12 +32,21 @@ struct nf_nat_ipv4_multi_range_compat {
 	struct nf_nat_ipv4_range	range[1];
 };
 
-struct nf_nat_range {
+typedef struct {
 	unsigned int			flags;
 	union nf_inet_addr		min_addr;
 	union nf_inet_addr		max_addr;
 	union nf_conntrack_man_proto	min_proto;
 	union nf_conntrack_man_proto	max_proto;
-};
+} nf_nat_range1;
+
+typedef struct nf_nat_range {
+	unsigned int			flags;
+	union nf_inet_addr		min_addr;
+	union nf_inet_addr		max_addr;
+	union nf_conntrack_man_proto	min_proto;
+	union nf_conntrack_man_proto	max_proto;
+	union nf_conntrack_man_proto	base_proto;
+} nf_nat_range2;
 
 #endif /* _NETFILTER_NF_NAT_H */
Pablo Neira Ayuso Jan. 17, 2018, 6:48 p.m. | #5
On Wed, Jan 17, 2018 at 07:31:33PM +0100, Thierry Du Tre wrote:
> Op 16/01/2018 om 15:32 schreef Pablo Neira Ayuso:
> > Hi Thierry,
> > 
> > On Mon, Jan 15, 2018 at 01:56:09PM +0100, Thierry Du Tre wrote:
> >> Hi Pablo,
> >>
> >> I prepared this third version to get aligned about the way forward for the extension for struct nf_nat_range.
> >>
> >> Renaming the old definition as you suggested indeed results in a much smaller patch for netfilter kernel part.
> >> However, doing it like this also means that userspace code will require changes to cope with the new value for sizeof(struct nf_nat_range).
> >>
> >> i.e. iptables-1.6.1 :
> >>
> >> ./extensions/libip6t_SNAT.c:306:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> >> ./extensions/libip6t_DNAT.c:290:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> >> ./extensions/libip6t_NETMAP.c:89:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> >> ./extensions/libip6t_MASQUERADE.c:159:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> >> ./extensions/libip6t_REDIRECT.c:158:    .userspacesize    = XT_ALIGN(sizeof(struct nf_nat_range)),
> >>
> >> As far as I understand, all these xt target modules will have to increment their revision which makes them incompatible with current kernel versions.
> >> The other option is to replace all occurences of nf_nat_range with nf_nat_range1 in these userspace libraries.
> >> That would solve iptables but possible other applications might also be impacted ?
> >>
> >> Somehow this doesn't seem right to me, so I might have misinterpreted your earlier response.
> > 
> > I guess you need to add new revisions for the userspace code too,
> > right? Am I missing anything?
> 
> I attached an example change for libipt6t_SNAT in iptables. This
> illustrates how renaming current nf_nat_range should be handled in
> userspace code.  If we would just increment the revision number in
> snat_tg_reg, then a new kernel with support for revision 2 would be
> required for creating SNAT rules.  So for compatibility with
> existing kernels, we need to keep current SNAT revision in but adapt
> to the renamed nf_nat_range datastruct.

Renaming in userspace is fine. If that's the concern, there is no
problem if this helps us keep the kernel patch small.

> ---
>  extensions/libip6t_SNAT.c        | 20 ++++++++++----------
>  include/linux/netfilter/nf_nat.h | 19 +++++++++++++++++--
>  2 files changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/extensions/libip6t_SNAT.c b/extensions/libip6t_SNAT.c
> index 7d74b3d..3eb0319 100644
> --- a/extensions/libip6t_SNAT.c
> +++ b/extensions/libip6t_SNAT.c
> @@ -47,7 +47,7 @@ static const struct xt_option_entry SNAT_opts[] = {
>  
>  /* Ranges expected in network order. */
>  static void
> -parse_to(const char *orig_arg, int portok, struct nf_nat_range *range)
> +parse_to(const char *orig_arg, int portok, nf_nat_range1 *range)
>  {
>  	char *arg, *start, *end = NULL, *colon = NULL, *dash, *error;
>  	const struct in6_addr *ip;
> @@ -150,7 +150,7 @@ parse_to(const char *orig_arg, int portok, struct nf_nat_range *range)
>  static void SNAT_parse(struct xt_option_call *cb)
>  {
>  	const struct ip6t_entry *entry = cb->xt_entry;
> -	struct nf_nat_range *range = cb->data;
> +	nf_nat_range1 *range = cb->data;
>  	int portok;
>  
>  	if (entry->ipv6.proto == IPPROTO_TCP ||
> @@ -182,7 +182,7 @@ static void SNAT_fcheck(struct xt_fcheck_call *cb)
>  {
>  	static const unsigned int f = F_TO_SRC | F_RANDOM;
>  	static const unsigned int r = F_TO_SRC | F_RANDOM_FULLY;
> -	struct nf_nat_range *range = cb->data;
> +	nf_nat_range1 *range = cb->data;
>  
>  	if ((cb->xflags & f) == f)
>  		range->flags |= NF_NAT_RANGE_PROTO_RANDOM;
> @@ -190,7 +190,7 @@ static void SNAT_fcheck(struct xt_fcheck_call *cb)
>  		range->flags |= NF_NAT_RANGE_PROTO_RANDOM_FULLY;
>  }
>  
> -static void print_range(const struct nf_nat_range *range)
> +static void print_range(const nf_nat_range1 *range)
>  {
>  	if (range->flags & NF_NAT_RANGE_MAP_IPS) {
>  		if (range->flags & NF_NAT_RANGE_PROTO_SPECIFIED)
> @@ -213,7 +213,7 @@ static void print_range(const struct nf_nat_range *range)
>  static void SNAT_print(const void *ip, const struct xt_entry_target *target,
>                         int numeric)
>  {
> -	const struct nf_nat_range *range = (const void *)target->data;
> +	const nf_nat_range1 *range = (const void *)target->data;
>  
>  	printf(" to:");
>  	print_range(range);
> @@ -227,7 +227,7 @@ static void SNAT_print(const void *ip, const struct xt_entry_target *target,
>  
>  static void SNAT_save(const void *ip, const struct xt_entry_target *target)
>  {
> -	const struct nf_nat_range *range = (const void *)target->data;
> +	const nf_nat_range1 *range = (const void *)target->data;
>  
>  	printf(" --to-source ");
>  	print_range(range);
> @@ -239,7 +239,7 @@ static void SNAT_save(const void *ip, const struct xt_entry_target *target)
>  		printf(" --persistent");
>  }
>  
> -static void print_range_xlate(const struct nf_nat_range *range,
> +static void print_range_xlate(const nf_nat_range1 *range,
>  			      struct xt_xlate *xl)
>  {
>  	bool proto_specified = range->flags & NF_NAT_RANGE_PROTO_SPECIFIED;
> @@ -270,7 +270,7 @@ static void print_range_xlate(const struct nf_nat_range *range,
>  static int SNAT_xlate(struct xt_xlate *xl,
>  		      const struct xt_xlate_tg_params *params)
>  {
> -	const struct nf_nat_range *range = (const void *)params->target->data;
> +	const nf_nat_range1 *range = (const void *)params->target->data;
>  	bool sep_need = false;
>  	const char *sep = " ";
>  
> @@ -300,8 +300,8 @@ static struct xtables_target snat_tg_reg = {
>  	.version	= XTABLES_VERSION,
>  	.family		= NFPROTO_IPV6,
>  	.revision	= 1,
> -	.size		= XT_ALIGN(sizeof(struct nf_nat_range)),
> -	.userspacesize	= XT_ALIGN(sizeof(struct nf_nat_range)),
> +	.size		= XT_ALIGN(sizeof(nf_nat_range1)),
> +	.userspacesize	= XT_ALIGN(sizeof(nf_nat_range1)),
>  	.help		= SNAT_help,
>  	.x6_parse	= SNAT_parse,
>  	.x6_fcheck	= SNAT_fcheck,
> diff --git a/include/linux/netfilter/nf_nat.h b/include/linux/netfilter/nf_nat.h
> index 1ad3659..3d1c1e7 100644
> --- a/include/linux/netfilter/nf_nat.h
> +++ b/include/linux/netfilter/nf_nat.h
> @@ -9,10 +9,16 @@
>  #define NF_NAT_RANGE_PROTO_RANDOM		(1 << 2)
>  #define NF_NAT_RANGE_PERSISTENT			(1 << 3)
>  #define NF_NAT_RANGE_PROTO_RANDOM_FULLY		(1 << 4)
> +#define NF_NAT_RANGE_PROTO_OFFSET		(1 << 5)
>  
>  #define NF_NAT_RANGE_PROTO_RANDOM_ALL		\
>  	(NF_NAT_RANGE_PROTO_RANDOM | NF_NAT_RANGE_PROTO_RANDOM_FULLY)
>  
> +#define NF_NAT_RANGE_MASK					\
> +	(NF_NAT_RANGE_MAP_IPS | NF_NAT_RANGE_PROTO_SPECIFIED |	\
> +	 NF_NAT_RANGE_PROTO_RANDOM | NF_NAT_RANGE_PERSISTENT |	\
> +	 NF_NAT_RANGE_PROTO_RANDOM_FULLY | NF_NAT_RANGE_PROTO_OFFSET)
> +
>  struct nf_nat_ipv4_range {
>  	unsigned int			flags;
>  	__be32				min_ip;
> @@ -26,12 +32,21 @@ struct nf_nat_ipv4_multi_range_compat {
>  	struct nf_nat_ipv4_range	range[1];
>  };
>  
> -struct nf_nat_range {
> +typedef struct {
>  	unsigned int			flags;
>  	union nf_inet_addr		min_addr;
>  	union nf_inet_addr		max_addr;
>  	union nf_conntrack_man_proto	min_proto;
>  	union nf_conntrack_man_proto	max_proto;
> -};
> +} nf_nat_range1;

In other existing code, eg. see extensions libxt_MARK.c, we use _v1
for this.

> +
> +typedef struct nf_nat_range {
> +	unsigned int			flags;
> +	union nf_inet_addr		min_addr;
> +	union nf_inet_addr		max_addr;
> +	union nf_conntrack_man_proto	min_proto;
> +	union nf_conntrack_man_proto	max_proto;
> +	union nf_conntrack_man_proto	base_proto;
> +} nf_nat_range2;

No typedef please as per kernel coding style.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/uapi/linux/netfilter/nf_nat.h b/include/uapi/linux/netfilter/nf_nat.h
index a33000d..0fa792a 100644
--- a/include/uapi/linux/netfilter/nf_nat.h
+++ b/include/uapi/linux/netfilter/nf_nat.h
@@ -10,6 +10,7 @@ 
 #define NF_NAT_RANGE_PROTO_RANDOM		(1 << 2)
 #define NF_NAT_RANGE_PERSISTENT			(1 << 3)
 #define NF_NAT_RANGE_PROTO_RANDOM_FULLY		(1 << 4)
+#define NF_NAT_RANGE_PROTO_OFFSET		(1 << 5)
 
 #define NF_NAT_RANGE_PROTO_RANDOM_ALL		\
 	(NF_NAT_RANGE_PROTO_RANDOM | NF_NAT_RANGE_PROTO_RANDOM_FULLY)
@@ -17,7 +18,7 @@ 
 #define NF_NAT_RANGE_MASK					\
 	(NF_NAT_RANGE_MAP_IPS | NF_NAT_RANGE_PROTO_SPECIFIED |	\
 	 NF_NAT_RANGE_PROTO_RANDOM | NF_NAT_RANGE_PERSISTENT |	\
-	 NF_NAT_RANGE_PROTO_RANDOM_FULLY)
+	 NF_NAT_RANGE_PROTO_RANDOM_FULLY | NF_NAT_RANGE_PROTO_OFFSET)
 
 struct nf_nat_ipv4_range {
 	unsigned int			flags;
@@ -32,12 +33,21 @@  struct nf_nat_ipv4_multi_range_compat {
 	struct nf_nat_ipv4_range	range[1];
 };
 
+struct nf_nat_range1 {
+	unsigned int			flags;
+	union nf_inet_addr		min_addr;
+	union nf_inet_addr		max_addr;
+	union nf_conntrack_man_proto	min_proto;
+	union nf_conntrack_man_proto	max_proto;
+};
+
 struct nf_nat_range {
 	unsigned int			flags;
 	union nf_inet_addr		min_addr;
 	union nf_inet_addr		max_addr;
 	union nf_conntrack_man_proto	min_proto;
 	union nf_conntrack_man_proto	max_proto;
+	union nf_conntrack_man_proto	base_proto;
 };
 
 #endif /* _NETFILTER_NF_NAT_H */
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index af8345f..de5c327 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -347,9 +347,10 @@  get_unique_tuple(struct nf_conntrack_tuple *tuple,
 	/* Only bother mapping if it's not already in range and unique */
 	if (!(range->flags & NF_NAT_RANGE_PROTO_RANDOM_ALL)) {
 		if (range->flags & NF_NAT_RANGE_PROTO_SPECIFIED) {
-			if (l4proto->in_range(tuple, maniptype,
-					      &range->min_proto,
-					      &range->max_proto) &&
+			if (!(range->flags & NF_NAT_RANGE_PROTO_OFFSET) &&
+			    l4proto->in_range(tuple, maniptype,
+			          &range->min_proto,
+			          &range->max_proto) &&
 			    (range->min_proto.all == range->max_proto.all ||
 			     !nf_nat_used_tuple(tuple, ct)))
 				goto out;
@@ -358,7 +359,7 @@  get_unique_tuple(struct nf_conntrack_tuple *tuple,
 		}
 	}
 
-	/* Last change: get protocol to try to obtain unique tuple. */
+	/* Last chance: get protocol to try to obtain unique tuple. */
 	l4proto->unique_tuple(l3proto, tuple, range, maniptype, ct);
 out:
 	rcu_read_unlock();
diff --git a/net/netfilter/nf_nat_proto_common.c b/net/netfilter/nf_nat_proto_common.c
index fbce552..2c30eca 100644
--- a/net/netfilter/nf_nat_proto_common.c
+++ b/net/netfilter/nf_nat_proto_common.c
@@ -80,6 +80,8 @@  void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
 						  : tuple->src.u.all);
 	} else if (range->flags & NF_NAT_RANGE_PROTO_RANDOM_FULLY) {
 		off = prandom_u32();
+	} else if (range->flags & NF_NAT_RANGE_PROTO_OFFSET) {
+		off = (ntohs(*portptr) - ntohs(range->base_proto.all));
 	} else {
 		off = *rover;
 	}
@@ -88,7 +90,8 @@  void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
 		*portptr = htons(min + off % range_size);
 		if (++i != range_size && nf_nat_used_tuple(tuple, ct))
 			continue;
-		if (!(range->flags & NF_NAT_RANGE_PROTO_RANDOM_ALL))
+		if (!(range->flags & (NF_NAT_RANGE_PROTO_RANDOM_ALL|
+					NF_NAT_RANGE_PROTO_OFFSET)))
 			*rover = off;
 		return;
 	}
diff --git a/net/netfilter/xt_nat.c b/net/netfilter/xt_nat.c
index 0fd14d1..7e8552b 100644
--- a/net/netfilter/xt_nat.c
+++ b/net/netfilter/xt_nat.c
@@ -41,6 +41,7 @@  static void xt_nat_convert_range(struct nf_nat_range *dst,
 {
 	memset(&dst->min_addr, 0, sizeof(dst->min_addr));
 	memset(&dst->max_addr, 0, sizeof(dst->max_addr));
+	memset(&dst->base_proto, 0, sizeof(dst->base_proto));
 
 	dst->flags	 = src->flags;
 	dst->min_addr.ip = src->min_ip;
@@ -85,6 +86,39 @@  xt_dnat_target_v0(struct sk_buff *skb, const struct xt_action_param *par)
 static unsigned int
 xt_snat_target_v1(struct sk_buff *skb, const struct xt_action_param *par)
 {
+	const struct nf_nat_range1 *range_v1 = par->targinfo;
+	struct nf_nat_range range = {};
+	enum ip_conntrack_info ctinfo;
+	struct nf_conn *ct;
+
+	ct = nf_ct_get(skb, &ctinfo);
+	WARN_ON(!(ct != NULL &&
+		 (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED ||
+		  ctinfo == IP_CT_RELATED_REPLY)));
+
+	memcpy(&range, range_v1, sizeof(*range_v1));
+	return nf_nat_setup_info(ct, &range, NF_NAT_MANIP_SRC);
+}
+
+static unsigned int
+xt_dnat_target_v1(struct sk_buff *skb, const struct xt_action_param *par)
+{
+	const struct nf_nat_range1 *range_v1 = par->targinfo;
+	struct nf_nat_range range = {};
+	enum ip_conntrack_info ctinfo;
+	struct nf_conn *ct;
+
+	ct = nf_ct_get(skb, &ctinfo);
+	WARN_ON(!(ct != NULL &&
+		 (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED)));
+
+	memcpy(&range, range_v1, sizeof(*range_v1));
+	return nf_nat_setup_info(ct, &range, NF_NAT_MANIP_DST);
+}
+
+static unsigned int
+xt_snat_target_v2(struct sk_buff *skb, const struct xt_action_param *par)
+{
 	const struct nf_nat_range *range = par->targinfo;
 	enum ip_conntrack_info ctinfo;
 	struct nf_conn *ct;
@@ -98,7 +132,7 @@  xt_snat_target_v1(struct sk_buff *skb, const struct xt_action_param *par)
 }
 
 static unsigned int
-xt_dnat_target_v1(struct sk_buff *skb, const struct xt_action_param *par)
+xt_dnat_target_v2(struct sk_buff *skb, const struct xt_action_param *par)
 {
 	const struct nf_nat_range *range = par->targinfo;
 	enum ip_conntrack_info ctinfo;
@@ -162,6 +196,28 @@  static struct xt_target xt_nat_target_reg[] __read_mostly = {
 				  (1 << NF_INET_LOCAL_OUT),
 		.me		= THIS_MODULE,
 	},
+	{
+		.name		= "SNAT",
+		.revision	= 2,
+		.checkentry	= xt_nat_checkentry,
+		.destroy	= xt_nat_destroy,
+		.target		= xt_snat_target_v2,
+		.targetsize	= sizeof(struct nf_nat_range),
+		.table		= "nat",
+		.hooks		= (1 << NF_INET_POST_ROUTING) |
+				  (1 << NF_INET_LOCAL_IN),
+		.me		= THIS_MODULE,
+	},
+	{
+		.name		= "DNAT",
+		.revision	= 2,
+		.target		= xt_dnat_target_v2,
+		.targetsize	= sizeof(struct nf_nat_range),
+		.table		= "nat",
+		.hooks		= (1 << NF_INET_PRE_ROUTING) |
+				  (1 << NF_INET_LOCAL_OUT),
+		.me		= THIS_MODULE,
+	},
 };
 
 static int __init xt_nat_init(void)