ipv6: Add support for RTA_PREFSRC

Message ID 1301903804.31789.234.camel@localhost
State RFC, archived
Delegated to: David Miller

Commit Message

Daniel Walter April 4, 2011, 7:56 a.m. UTC
On Fri, 2011-04-01 at 20:46 -0700, David Miller wrote:
> You can't change the layout of "struct in6_rtmsg", as that structure
> is explicitly exported to user space and changing it will break every
> application out there.

Hi,

I've kicked support for setting the preferred source via ioctl,
to keep "struct in6_rtmsg" untouched.
This reduces the RTA_PREFSRC support to netlink only, unless
we break the struct.

Do you see any other way around this problem?

regards,
daniel

---

Daniel Walter
Software Engineer

Barracuda Networks AG
Eduard-Bodem-Gasse 1
6020 Innsbruck
Austria

Phone: +43 (0) 508 100
Fax: +43 508 100 20
eMail: mailto:DWalter@barracuda.com
Web: www.barracudanetworks.com, www.phion.com


Barracuda Networks solutions are now available as virtual appliances. 
Visit www.barracudanetworks.com/vx for more information.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller April 7, 2011, 1:37 a.m. UTC | #1
From: Daniel Walter <dwalter@barracuda.com>
Date: Mon, 4 Apr 2011 09:56:44 +0200

> On Fri, 2011-04-01 at 20:46 -0700, David Miller wrote:
>> You can't change the layout of "struct in6_rtmsg", as that structure
>> is explicitly exported to user space and changing it will break every
>> application out there.
> 
> Hi,
> 
> I've kicked support for setting the preferred source via ioctl,
> to keep "struct in6_rtmsg" untouched.
> This reduces the RTA_PREFSRC support to netlink only, unless
> we break the struct.
> 
> Do you see any other way around this problem?

This is fine; adding new feature support to deprecated things like
the ioctl routing calls is undesirable anyway.

Since you do the prefsrc extraction in at least two places, make a
helper function that does the whole "if prefsrc.plen use prefsrc, else
use ipv6_dev_get_saddr()".

This would be akin to ipv4's FIB_RES_PREFSRC.
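
A minimal userspace sketch of the helper suggested above, under loud
assumptions: the structs are simplified stand-ins for the kernel types, and
the names rt6_get_saddr_sketch, saddr_fallback_fn and demo_fallback are
hypothetical and not part of the patch. The callback stands in for
ipv6_dev_get_saddr(), whose real signature takes more arguments.

```c
#include <string.h>
#include <netinet/in.h>

/* Simplified stand-ins for the kernel's struct rt6key / struct rt6_info. */
struct rt6key_sketch {
	struct in6_addr addr;
	int plen;		/* 0 means no preferred source is set */
};

struct rt6_info_sketch {
	struct rt6key_sketch rt6i_prefsrc;
};

/* Hypothetical callback standing in for ipv6_dev_get_saddr(); 0 on success. */
typedef int (*saddr_fallback_fn)(const struct in6_addr *daddr,
				 struct in6_addr *saddr);

/* Use the route's preferred source if set, else defer to the fallback. */
static int rt6_get_saddr_sketch(const struct rt6_info_sketch *rt,
				const struct in6_addr *daddr,
				saddr_fallback_fn fallback,
				struct in6_addr *saddr)
{
	if (rt->rt6i_prefsrc.plen) {
		memcpy(saddr, &rt->rt6i_prefsrc.addr, sizeof(*saddr));
		return 0;
	}
	return fallback(daddr, saddr);
}

/* Demo fallback for illustration only: always selects ::1. */
static int demo_fallback(const struct in6_addr *daddr, struct in6_addr *saddr)
{
	(void)daddr;
	*saddr = in6addr_loopback;
	return 0;
}
```

With a helper of this shape, both call sites touched by the patch
(ip6_dst_lookup_tail() and rt6_fill_node()) could share one code path
instead of repeating the plen check.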

Florian Westphal April 7, 2011, 9:25 p.m. UTC | #2
David Miller <davem@davemloft.net> wrote:
> From: Daniel Walter <dwalter@barracuda.com>
> Date: Mon, 4 Apr 2011 09:56:44 +0200
> > This reduces the RTA_PREFSRC support to netlink only, unless
> > we break the struct.
> 
[..]
> Since you do the prefsrc extraction in at least two places, make a
> helper function that does the whole "if prefsrc.plen use prefsrc, else
> use ipv6_dev_get_saddr()"
> 
> This would be akin to ipv4's FIB_RES_PREFSRC

OK, I'll bite.

What's wrong with using ipv6 addrlabels to pick the desired address,
and, if there is a problem, why is it not fixable?

Just wondering...

Daniel Walter April 11, 2011, 7:22 a.m. UTC | #3
On Thu, 2011-04-07 at 14:25 -0700, Florian Westphal wrote:
> David Miller <davem@davemloft.net> wrote:
> > From: Daniel Walter <dwalter@barracuda.com>
> > Date: Mon, 4 Apr 2011 09:56:44 +0200
> > > This reduces the RTA_PREFSRC support to netlink only, unless
> > > we break the struct.
> > 
> [..]
> > Since you do the prefsrc extraction in at least two places, make a
> > helper function that does the whole "if prefsrc.plen use prefsrc, else
> > use ipv6_dev_get_saddr()"
> > 
> > This would be akin to ipv4's FIB_RES_PREFSRC
> 
> OK, I'll bite.
> 
> What's wrong with using ipv6 addrlabels to pick the desired address,
> and, if there is a problem, why is it not fixable?
> 
> Just wondering...
Hi,

As far as I've understood addrlabels, they allow me to define the
overall preferred source. As soon as I want to select the default src
only for a given route, addrlabels cannot do the job.
For example:
ip-addresses on eth0
2001:0DB8::1/64
2001:0DB8::2/64
routes
2001:0DB8::/64
2001:0DB8:0:dead::/64 via 2001:0DB8::1234/64

addrlabels allow me to set the default source address to 2001:0DB8::1 for
both routes. With pref_src selection one is able to set the default
outgoing address for each route to the needed address, which may be
2001:0DB8::1 for the first route, and 2001:0DB8::2 for the remaining one.

Please feel free to correct me if I misunderstood something.
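
The per-route selection described above can be sketched in userspace C. This
is an illustration only, not kernel code: struct route_sketch, routes[],
prefix_match() and lookup_prefsrc() are hypothetical names, and the table
simply hard-codes the two example routes with the preferred sources
mentioned in the message.

```c
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical route entry: destination prefix plus a per-route source. */
struct route_sketch {
	const char *prefix;
	int plen;
	const char *prefsrc;
};

/* The two example routes, each with its own preferred source. */
static const struct route_sketch routes[] = {
	{ "2001:db8::",        64, "2001:db8::1" },
	{ "2001:db8:0:dead::", 64, "2001:db8::2" },
};

/* Return 1 if the first plen bits of addr and prefix are equal. */
static int prefix_match(const struct in6_addr *addr,
			const struct in6_addr *prefix, int plen)
{
	int full = plen / 8, rem = plen % 8;

	if (memcmp(addr->s6_addr, prefix->s6_addr, full) != 0)
		return 0;
	if (rem) {
		unsigned char mask = (unsigned char)(0xff << (8 - rem));
		if ((addr->s6_addr[full] ^ prefix->s6_addr[full]) & mask)
			return 0;
	}
	return 1;
}

/* Longest-prefix match over routes[]; NULL if no route covers dst. */
static const char *lookup_prefsrc(const char *dst)
{
	struct in6_addr d, p;
	const char *best = NULL;
	int best_len = -1;
	size_t i;

	if (inet_pton(AF_INET6, dst, &d) != 1)
		return NULL;
	for (i = 0; i < sizeof(routes) / sizeof(routes[0]); i++) {
		if (inet_pton(AF_INET6, routes[i].prefix, &p) != 1)
			continue;
		if (routes[i].plen > best_len &&
		    prefix_match(&d, &p, routes[i].plen)) {
			best = routes[i].prefsrc;
			best_len = routes[i].plen;
		}
	}
	return best;
}
```

The point of the sketch is that the source address hangs off the route
entry, so two routes out of the same interface can each carry a different
one.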

Florian Westphal April 11, 2011, 7:58 a.m. UTC | #4
Daniel Walter <dwalter@barracuda.com> wrote:
> On Thu, 2011-04-07 at 14:25 -0700, Florian Westphal wrote:
> > David Miller <davem@davemloft.net> wrote:
> > > This would be akin to ipv4's FIB_RES_PREFSRC
> > 
> > OK, I'll bite.
> > 
> > What's wrong with using ipv6 addrlabels to pick the desired address,
> > and, if there is a problem, why is it not fixable?
> > 
> > Just wondering...
> Hi,
> 
> As far as I've understood addrlabels, they allow me to define the
> overall preferred source. As soon as I want to select the default src
> only for a given route, addrlabels cannot do the job.
> For example:
> ip-addresses on eth0
> 2001:0DB8::1/64
> 2001:0DB8::2/64
> routes
> 2001:0DB8::/64
> 2001:0DB8:0:dead::/64 via 2001:0DB8::1234/64
> 
> addrlabels allow me to set the default source address to 2001:0DB8::1 for
> both routes. With pref_src selection one is able to set the default
> outgoing address for each route to the needed address, which may be
> 2001:0DB8::1 for the first route, and 2001:0DB8::2 for the remaining one.
> 
> Please feel free to correct me if I misunderstood something.

ip addrlabel add label 1000 prefix 2001:db8::1
ip addrlabel add label 1000 prefix 2001:db8::/64
ip addrlabel add label 1001 prefix 2001:db8::2
ip addrlabel add label 1001 prefix 2001:0DB8:0:dead::/64

ip route add 2001:db8:0:dead::/64 via 2001:db8::1234

This should tell the stack to pick 2001:db8::1 as the
source address when talking to 2001:db8::/64, and to
use 2001:db8::2 when talking to 2001:0DB8:0:dead::42.
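
The label matching behind the commands above can be sketched as follows.
This is a userspace illustration under stated assumptions, not the kernel's
implementation: it models only the "prefer matching label" rule of RFC 3484
source address selection, ignores the earlier rules and the default label
table, and the names struct label_sketch, labels[], prefix_match(),
label_of() and pick_source_by_label() are hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical label table mirroring the four addrlabel commands. */
struct label_sketch {
	const char *prefix;
	int plen;
	long label;
};

static const struct label_sketch labels[] = {
	{ "2001:db8::1",       128, 1000 },
	{ "2001:db8::",         64, 1000 },
	{ "2001:db8::2",       128, 1001 },
	{ "2001:db8:0:dead::",  64, 1001 },
};

/* Return 1 if the first plen bits of addr and prefix are equal. */
static int prefix_match(const struct in6_addr *addr,
			const struct in6_addr *prefix, int plen)
{
	int full = plen / 8, rem = plen % 8;

	if (memcmp(addr->s6_addr, prefix->s6_addr, full) != 0)
		return 0;
	if (rem) {
		unsigned char mask = (unsigned char)(0xff << (8 - rem));
		if ((addr->s6_addr[full] ^ prefix->s6_addr[full]) & mask)
			return 0;
	}
	return 1;
}

/* Label of the longest matching prefix, or -1 if nothing matches. */
static long label_of(const char *addr)
{
	struct in6_addr a, p;
	long best = -1;
	int best_len = -1;
	size_t i;

	if (inet_pton(AF_INET6, addr, &a) != 1)
		return -1;
	for (i = 0; i < sizeof(labels) / sizeof(labels[0]); i++) {
		if (inet_pton(AF_INET6, labels[i].prefix, &p) != 1)
			continue;
		if (labels[i].plen > best_len &&
		    prefix_match(&a, &p, labels[i].plen)) {
			best = labels[i].label;
			best_len = labels[i].plen;
		}
	}
	return best;
}

/* First candidate whose label equals the destination's label, else NULL. */
static const char *pick_source_by_label(const char *dst,
					const char *const *cands, size_t n)
{
	long dl = label_of(dst);
	size_t i;

	for (i = 0; dl >= 0 && i < n; i++)
		if (label_of(cands[i]) == dl)
			return cands[i];
	return NULL;	/* no label match: other rules would decide */
}
```

In this model the choice depends on the destination address and the label
table rather than on anything stored in the route entry.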

Patch

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index bc3cde0..98348d5 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -42,6 +42,7 @@  struct fib6_config {
 
 	struct in6_addr	fc_dst;
 	struct in6_addr	fc_src;
+	struct in6_addr	fc_prefsrc;
 	struct in6_addr	fc_gateway;
 
 	unsigned long	fc_expires;
@@ -107,6 +108,7 @@  struct rt6_info {
 	struct rt6key			rt6i_dst ____cacheline_aligned_in_smp;
 	u32				rt6i_flags;
 	struct rt6key			rt6i_src;
+	struct rt6key			rt6i_prefsrc;
 	u32				rt6i_metric;
 	u32				rt6i_peer_genid;
 
diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index c850e5f..2b37c20 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -141,6 +141,7 @@  struct rt6_rtnl_dump_arg {
 extern int rt6_dump_route(struct rt6_info *rt, void *p_arg);
 extern void rt6_ifdown(struct net *net, struct net_device *dev);
 extern void rt6_mtu_change(struct net_device *dev, unsigned mtu);
+extern void rt6_remove_prefsrc(struct inet6_ifaddr *ifp);
 
 
 /*
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 3daaf3c..26f9e14 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -825,6 +825,8 @@  static void ipv6_del_addr(struct inet6_ifaddr *ifp)
 		dst_release(&rt->dst);
 	}
 
+	/* clean up prefsrc entries */
+	rt6_remove_prefsrc(ifp);
 out:
 	in6_ifa_put(ifp);
 }
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 1820887..e2d8463 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -930,10 +930,14 @@  static int ip6_dst_lookup_tail(struct sock *sk,
 		goto out_err_release;
 
 	if (ipv6_addr_any(&fl6->saddr)) {
-		err = ipv6_dev_get_saddr(net, ip6_dst_idev(*dst)->dev,
-					 &fl6->daddr,
-					 sk ? inet6_sk(sk)->srcprefs : 0,
-					 &fl6->saddr);
+		struct rt6_info *rt = (struct rt6_info *) *dst;
+		if (rt->rt6i_prefsrc.plen)
+			ipv6_addr_copy(&fl6->saddr, &rt->rt6i_prefsrc.addr);
+		else
+			err = ipv6_dev_get_saddr(net, ip6_dst_idev(*dst)->dev,
+						 &fl6->daddr,
+						 sk ? inet6_sk(sk)->srcprefs : 0,
+						 &fl6->saddr);
 		if (err)
 			goto out_err_release;
 	}
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 843406f..f59dbae 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1325,6 +1325,20 @@  int ip6_route_add(struct fib6_config *cfg)
 	if (dev == NULL)
 		goto out;
 
+	/* check if prefsrc is set */
+	if (!ipv6_addr_any(&cfg->fc_prefsrc)) {
+		struct in6_addr saddr_buf;
+		ipv6_addr_copy(&saddr_buf, &cfg->fc_prefsrc);
+		if (!ipv6_chk_addr(net, &saddr_buf, dev, 0)) {
+			printk(KERN_DEBUG "invalid pref_src\n");
+			err = -EINVAL;
+			goto out;
+		}
+		ipv6_addr_copy(&rt->rt6i_prefsrc.addr, &cfg->fc_prefsrc);
+		rt->rt6i_prefsrc.plen = 128;
+	} else
+		rt->rt6i_prefsrc.plen = 0;
+
 	if (cfg->fc_flags & (RTF_GATEWAY | RTF_NONEXTHOP)) {
 		rt->rt6i_nexthop = __neigh_lookup_errno(&nd_tbl, &rt->rt6i_gateway, dev);
 		if (IS_ERR(rt->rt6i_nexthop)) {
@@ -2037,6 +2051,39 @@  struct rt6_info *addrconf_dst_alloc(struct inet6_dev *idev,
 	return rt;
 }
 
+/* remove deleted ip from prefsrc entries */
+struct arg_dev_net_ip {
+	struct net_device *dev;
+	struct net *net;
+	struct in6_addr *addr;
+};
+
+static int fib6_remove_prefsrc(struct rt6_info *rt, void *arg)
+{
+	struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev;
+	struct net *net = ((struct arg_dev_net_ip *)arg)->net;
+	struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr;
+
+	if (((void *)rt->rt6i_dev == dev || dev == NULL) &&
+	    rt != net->ipv6.ip6_null_entry &&
+	    ipv6_addr_equal(addr, &rt->rt6i_prefsrc.addr)) {
+		/* remove prefsrc entry */
+		rt->rt6i_prefsrc.plen = 0;
+	}
+	return 0;
+}
+
+void rt6_remove_prefsrc(struct inet6_ifaddr *ifp)
+{
+	struct net *net = dev_net(ifp->idev->dev);
+	struct arg_dev_net_ip adni = {
+		.dev = ifp->idev->dev,
+		.net = net,
+		.addr = &ifp->addr,
+	};
+	fib6_clean_all(net, fib6_remove_prefsrc, 0, &adni);
+}
+
 struct arg_dev_net {
 	struct net_device *dev;
 	struct net *net;
@@ -2183,6 +2230,9 @@  static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
 		nla_memcpy(&cfg->fc_src, tb[RTA_SRC], plen);
 	}
 
+	if (tb[RTA_PREFSRC])
+		nla_memcpy(&cfg->fc_prefsrc, tb[RTA_PREFSRC], 16);
+
 	if (tb[RTA_OIF])
 		cfg->fc_ifindex = nla_get_u32(tb[RTA_OIF]);
 
@@ -2325,11 +2375,22 @@  static int rt6_fill_node(struct net *net,
 #endif
 			NLA_PUT_U32(skb, RTA_IIF, iif);
 	} else if (dst) {
-		struct inet6_dev *idev = ip6_dst_idev(&rt->dst);
 		struct in6_addr saddr_buf;
-		if (ipv6_dev_get_saddr(net, idev ? idev->dev : NULL,
-				       dst, 0, &saddr_buf) == 0)
+		if (rt->rt6i_prefsrc.plen) {
+			ipv6_addr_copy(&saddr_buf, &rt->rt6i_prefsrc.addr);
 			NLA_PUT(skb, RTA_PREFSRC, 16, &saddr_buf);
+		} else {
+			struct inet6_dev *idev = ip6_dst_idev(&rt->dst);
+			if (ipv6_dev_get_saddr(net, idev ? idev->dev : NULL,
+					       dst, 0, &saddr_buf) == 0)
+				NLA_PUT(skb, RTA_PREFSRC, 16, &saddr_buf);
+		}
+	}
+
+	if (rt->rt6i_prefsrc.plen) {
+		struct in6_addr saddr_buf;
+		ipv6_addr_copy(&saddr_buf, &rt->rt6i_prefsrc.addr);
+		NLA_PUT(skb, RTA_PREFSRC, 16, &saddr_buf);
 	}
 
 	if (rtnetlink_put_metrics(skb, dst_metrics_ptr(&rt->dst)) < 0)