diff mbox

fib_rules: add minimum prefix length

Message ID 20130723220221.GP10216@zirkel.wertarbyte.de
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Stefan Tomanek July 23, 2013, 10:02 p.m. UTC
This change adds a minimum prefix length to the structures of routing rules.
If a rule is added with a minimum prefix length >0, only routes meeting this
threshold will be considered. Any other (more general) routing table entries
will be ignored.

When configuring a system with multiple network uplinks and default routes, it
is often convinient to reference the main routing table multiple times - but
omitting the default route. Using this patch and a modified "ip" utility, this
can be achieved by using the following command sequence:

  $ ip route add table secuplink default via 10.42.23.1

  $ ip rule add pref 100            table main prefixlength 1
  $ ip rule add pref 150 fwmark 0xA table secuplink

With this setup, packets marked 0xA will be processed by the additional routing
table "secuplink", but only if no suitable route in the main routing table can
be found. By using a minimal prefixlength of 1, the default route (/0) of the
table "main" is hidden to packets processed by rule 100; packets traveling to
destinations with more specific routing entries are processed as usual.

Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
---
 include/net/fib_rules.h        |    1 +
 include/uapi/linux/fib_rules.h |    2 +-
 net/core/fib_rules.c           |    5 +++++
 net/ipv4/fib_rules.c           |   11 ++++++++++-
 4 files changed, 17 insertions(+), 2 deletions(-)

Comments

Stephen Hemminger July 23, 2013, 10:38 p.m. UTC | #1
On Wed, 24 Jul 2013 00:02:21 +0200
Stefan Tomanek <stefan.tomanek@wertarbyte.de> wrote:

> diff --git a/include/uapi/linux/fib_rules.h b/include/uapi/linux/fib_rules.h
> index 51da65b..59cd31b 100644
> --- a/include/uapi/linux/fib_rules.h
> +++ b/include/uapi/linux/fib_rules.h
> @@ -45,7 +45,7 @@ enum {
>  	FRA_FLOW,	/* flow/class id */
>  	FRA_UNUSED6,
>  	FRA_UNUSED7,
> -	FRA_UNUSED8,
> +	FRA_TABLE_PREFIXLEN_MIN,
>  	FRA_TABLE,	/* Extended table id */
>  	FRA_FWMASK,	/* mask for netfilter mark */
>  	FRA_OIFNAME,

Not sure if reusing an entry or adding a new value is better
to retain compatibility. Also don't you have to update FRA_GENERIC_POLICY?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Tomanek July 23, 2013, 10:52 p.m. UTC | #2
Dies schrieb Stephen Hemminger (stephen@networkplumber.org):

> > --- a/include/uapi/linux/fib_rules.h
> > +++ b/include/uapi/linux/fib_rules.h
> > @@ -45,7 +45,7 @@ enum {
> >  	FRA_FLOW,	/* flow/class id */
> >  	FRA_UNUSED6,
> >  	FRA_UNUSED7,
> > -	FRA_UNUSED8,
> > +	FRA_TABLE_PREFIXLEN_MIN,
> >  	FRA_TABLE,	/* Extended table id */
> >  	FRA_FWMASK,	/* mask for netfilter mark */
> >  	FRA_OIFNAME,
> 
> Not sure if reusing an entry or adding a new value is better
> to retain compatibility.

Well, neither am I; my first changesets did in fact act a new value,
but I then switched to reusing an existing one. I am open to suggestions
from people having more experience with the existing code base.

> Also don't you have to update FRA_GENERIC_POLICY?

Well, I'm not sure - as I said, I was glad that I was able to add this
functionality (which I was craving for quite some time now) to the kernel,
so please feel free to point out (and possibly iron out) any issues with it.

Thank you very much for your response
Stefan Tomanek
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa July 24, 2013, 2:14 a.m. UTC | #3
Hi!

On Wed, Jul 24, 2013 at 12:02:21AM +0200, Stefan Tomanek wrote:
> This change adds a minimum prefix length to the structures of routing rules.
> If a rule is added with a minimum prefix length >0, only routes meeting this
> threshold will be considered. Any other (more general) routing table entries
> will be ignored.
> 
> When configuring a system with multiple network uplinks and default routes, it
> is often convinient to reference the main routing table multiple times - but
> omitting the default route. Using this patch and a modified "ip" utility, this
> can be achieved by using the following command sequence:

Yeah, it is sometimes pretty hideous to set up, especially if one uses
ppp stuff and such. But I am unsure if this change does actually improve
that considerable. Static setups should be easily doable right now and for
ppp/vpn stuff, I fear, it would still lack a bit of flexibility.

Off-topic:
I had the idea of supporting per process routing tables to deal with
that: ip route exec table foo pppd ...
So that every change of pppd and childs would only affect table foo (the
details could be hairy but currently not enough time to find out).

> diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
> index 26aa65d..94e9051 100644
> --- a/net/ipv4/fib_rules.c
> +++ b/net/ipv4/fib_rules.c
> @@ -95,8 +95,17 @@ static int fib4_rule_action(struct fib_rule *rule, struct flowi *flp,
>  		goto errout;
>  
>  	err = fib_table_lookup(tbl, &flp->u.ip4, (struct fib_result *) arg->result, arg->flags);
> -	if (err > 0)
> +	if (err > 0) {
>  		err = -EAGAIN;
> +		goto errout;
> +	}
> +	/* do not accept result if the route does not meet the required prefix length */
> +	if (arg->result) {
> +		if (((struct fib_result *)arg->result)->prefixlen < rule->table_prefixlen_min) {
> +			err = -EAGAIN;
> +			goto errout;
> +		}
> +	}
>  errout:
>  	return err;
>  }

I would try to factor the prefixlen_min check out into a
e.g. fib4_rule_constrain function for which a new field in fib_rules_ops
needs to be created as callback. Also it would be nice to have IPv6
support, too. ;)

Greetings,

  Hannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Tomanek July 24, 2013, 7:57 a.m. UTC | #4
Dies schrieb Hannes Frederic Sowa (hannes@stressinduktion.org):

> Yeah, it is sometimes pretty hideous to set up, especially if one uses
> ppp stuff and such. But I am unsure if this change does actually improve
> that considerable. Static setups should be easily doable right now and for
> ppp/vpn stuff, I fear, it would still lack a bit of flexibility.

Well, it does work for me. I am using a dynamic PPP uplink and an OpenVPN
tunnel where some marked traffic is guided through.

To achieve this, I always had to configure pppd with "nodefaultroute" just to
add the default route it manually in a separate table, complicating the
configuration process.

With my patches added, I can just reference a "masked" version of the main
routing table at first, ignoring the default route placed there by pppd:

echo "vpn" >> /etc/iproute2/rt_tables
ip route add table vpn default via tun0
ip rule add pref 100 lookup main prefixlength 0
ip rule add pref 200 fwmark 0xA lookup vpn

             |
             V
[ table main prefixlength >0 ]
             |
             V
       <fwmark 0xA?>   ->  [ table vpn ]
             |                   |
	     |    ,--------------ยด
	     V    V
        [ table main ]

That way, there is no need to reconfigure pppd, dhclient etc. If a specific
route of the main table matches, it will we used. If the main table just points
to the default route (prefixlengt == 0), it will be ignored and the packet
travels to the next rule. In the end, the complete main table might still be
consulted, including the previously shunned default route.

Works great and requires little to no hacking around distribution specific
network scripts.

> I would try to factor the prefixlen_min check out into a
> e.g. fib4_rule_constrain function for which a new field in fib_rules_ops
> needs to be created as callback. Also it would be nice to have IPv6
> support, too. ;)

Why not, sure. Working solutions today, better solutions tomorrow :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Tomanek July 25, 2013, 4:29 p.m. UTC | #5
Dies schrieb Hannes Frederic Sowa (hannes@stressinduktion.org):

> I would try to factor the prefixlen_min check out into a
> e.g. fib4_rule_constrain function for which a new field in fib_rules_ops
> needs to be created as callback. Also it would be nice to have IPv6
> support, too. ;)

I was working on my patchset again and considered your suggestion; however I'm
not sure whether factoring out the constraints into a separate function is
actually that useful, since they are only called from one specific location for
each protocol; can you think of another useful application?

I however did like the idea of adding IPv6 support, so I did -
I'll post the new patch later on.

Thanks for your feedback.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa July 25, 2013, 6:17 p.m. UTC | #6
On Thu, Jul 25, 2013 at 06:29:32PM +0200, Stefan Tomanek wrote:
> Dies schrieb Hannes Frederic Sowa (hannes@stressinduktion.org):
> 
> > I would try to factor the prefixlen_min check out into a
> > e.g. fib4_rule_constrain function for which a new field in fib_rules_ops
> > needs to be created as callback. Also it would be nice to have IPv6
> > support, too. ;)
> 
> I was working on my patchset again and considered your suggestion; however I'm
> not sure whether factoring out the constraints into a separate function is
> actually that useful, since they are only called from one specific location for
> each protocol; can you think of another useful application?

I don't have a strong opinion on that but I do find that it does better fit into
the design of fib_rules and will be easier to extend in future IMHO.

> I however did like the idea of adding IPv6 support, so I did -
> I'll post the new patch later on.

Cool, thanks.

Greetings,

  Hannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Tomanek July 25, 2013, 6:28 p.m. UTC | #7
Dies schrieb Hannes Frederic Sowa (hannes@stressinduktion.org):

> > I however did like the idea of adding IPv6 support, so I did -
> > I'll post the new patch later on.
> 
> Cool, thanks.

I've just completed the patch and sent it to the list, and I also placed the
changes in my github account, just in case anyone wants to take a quick peek:
https://github.com/wertarbyte/linux/compare/prefixlength
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Collins July 25, 2013, 9:46 p.m. UTC | #8
On Thu, Jul 25, 2013 at 12:28 PM, Stefan Tomanek
<stefan.tomanek@wertarbyte.de> wrote:
> Dies schrieb Hannes Frederic Sowa (hannes@stressinduktion.org):
>
>> > I however did like the idea of adding IPv6 support, so I did -
>> > I'll post the new patch later on.
>>
>> Cool, thanks.
>
> I've just completed the patch and sent it to the list, and I also placed the
> changes in my github account, just in case anyone wants to take a quick peek:
> https://github.com/wertarbyte/linux/compare/prefixlength
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

> +  if (arg->result) { ...

This NULL check seems unnecessary (fib_table_lookup doesn't check).

I like the idea, but I agree with Hannes that it'd be nice to split
this out into a fib constraint callback.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Tomanek July 25, 2013, 10:11 p.m. UTC | #9
Dies schrieb Andrew Collins (bsderandrew@gmail.com):

> > I've just completed the patch and sent it to the list, and I also placed the
> > changes in my github account, just in case anyone wants to take a quick peek:
> > https://github.com/wertarbyte/linux/compare/prefixlength
> 
> > +  if (arg->result) { ...
> 
> This NULL check seems unnecessary (fib_table_lookup doesn't check).

I wasn't sure if it was possible for fib_table_lookup to put a NULL pointer there,
so I wanted to make sure following it with ->prefixlen was possible. 

> I like the idea, but I agree with Hannes that it'd be nice to split
> this out into a fib constraint callback.

I'm still not convinced that creating such a callback that will never be called
from outside fib{4,}_rule_action() is worth the effort, but my insight into the code
is still limited; if it's the right thing to do [tm], I won't stand in the way :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Tomanek July 26, 2013, 10:54 a.m. UTC | #10
Dies schrieb Andrew Collins (bsderandrew@gmail.com):

> I like the idea, but I agree with Hannes that it'd be nice to split
> this out into a fib constraint callback.

Here we go again:
https://github.com/wertarbyte/linux/compare/fib_rule_suppress
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index e361f48..b098a9d 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -18,6 +18,7 @@  struct fib_rule {
 	u32			pref;
 	u32			flags;
 	u32			table;
+	u8			table_prefixlen_min;
 	u8			action;
 	u32			target;
 	struct fib_rule __rcu	*ctarget;
diff --git a/include/uapi/linux/fib_rules.h b/include/uapi/linux/fib_rules.h
index 51da65b..59cd31b 100644
--- a/include/uapi/linux/fib_rules.h
+++ b/include/uapi/linux/fib_rules.h
@@ -45,7 +45,7 @@  enum {
 	FRA_FLOW,	/* flow/class id */
 	FRA_UNUSED6,
 	FRA_UNUSED7,
-	FRA_UNUSED8,
+	FRA_TABLE_PREFIXLEN_MIN,
 	FRA_TABLE,	/* Extended table id */
 	FRA_FWMASK,	/* mask for netfilter mark */
 	FRA_OIFNAME,
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 2173544..8a15621 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -337,6 +337,8 @@  static int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh)
 	rule->action = frh->action;
 	rule->flags = frh->flags;
 	rule->table = frh_get_table(frh, tb);
+	if (tb[FRA_TABLE_PREFIXLEN_MIN])
+		rule->table_prefixlen_min = nla_get_u8(tb[FRA_TABLE_PREFIXLEN_MIN]);
 
 	if (!tb[FRA_PRIORITY] && ops->default_pref)
 		rule->pref = ops->default_pref(ops);
@@ -523,6 +525,7 @@  static inline size_t fib_rule_nlmsg_size(struct fib_rules_ops *ops,
 			 + nla_total_size(IFNAMSIZ) /* FRA_OIFNAME */
 			 + nla_total_size(4) /* FRA_PRIORITY */
 			 + nla_total_size(4) /* FRA_TABLE */
+			 + nla_total_size(1) /* FRA_TABLE_PREFIXLEN_MIN */
 			 + nla_total_size(4) /* FRA_FWMARK */
 			 + nla_total_size(4); /* FRA_FWMASK */
 
@@ -548,6 +551,8 @@  static int fib_nl_fill_rule(struct sk_buff *skb, struct fib_rule *rule,
 	frh->table = rule->table;
 	if (nla_put_u32(skb, FRA_TABLE, rule->table))
 		goto nla_put_failure;
+	if (nla_put_u8(skb, FRA_TABLE_PREFIXLEN_MIN, rule->table_prefixlen_min))
+		goto nla_put_failure;
 	frh->res1 = 0;
 	frh->res2 = 0;
 	frh->action = rule->action;
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 26aa65d..94e9051 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -95,8 +95,17 @@  static int fib4_rule_action(struct fib_rule *rule, struct flowi *flp,
 		goto errout;
 
 	err = fib_table_lookup(tbl, &flp->u.ip4, (struct fib_result *) arg->result, arg->flags);
-	if (err > 0)
+	if (err > 0) {
 		err = -EAGAIN;
+		goto errout;
+	}
+	/* do not accept result if the route does not meet the required prefix length */
+	if (arg->result) {
+		if (((struct fib_result *)arg->result)->prefixlen < rule->table_prefixlen_min) {
+			err = -EAGAIN;
+			goto errout;
+		}
+	}
 errout:
 	return err;
 }