diff mbox

[net,v3] net: ipv6: allow explicitly choosing optimistic addresses

Message ID 1422859156-30691-1-git-send-email-ek@google.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Erik Kline Feb. 2, 2015, 6:39 a.m. UTC
RFC 4429 ("Optimistic DAD") states that optimistic addresses
should be treated as deprecated addresses.  From section 2.1:

   Unless noted otherwise, components of the IPv6 protocol stack
   should treat addresses in the Optimistic state equivalently to
   those in the Deprecated state, indicating that the address is
   available for use but should not be used if another suitable
   address is available.

Optimistic addresses are indeed avoided when other addresses are
available (i.e. at source address selection time), but they have
not heretofore been available for things like explicit bind() and
sendmsg() with struct in6_pktinfo, etc.

This change makes optimistic addresses treated more like
deprecated addresses than tentative ones.

Signed-off-by: Erik Kline <ek@google.com>
---
 include/net/addrconf.h |  3 +++
 net/ipv6/addrconf.c    | 17 +++++++++++++++--
 net/ipv6/ndisc.c       |  4 +++-
 3 files changed, 21 insertions(+), 3 deletions(-)

Comments

Lorenzo Colitti Feb. 3, 2015, 3:40 p.m. UTC | #1
On Mon, Feb 2, 2015 at 3:39 PM, Erik Kline <ek@google.com> wrote:
> @@ -1526,8 +1534,13 @@ int ipv6_chk_addr(struct net *net, const struct in6_addr *addr,
>         hlist_for_each_entry_rcu(ifp, &inet6_addr_lst[hash], addr_lst) {
>                 if (!net_eq(dev_net(ifp->idev->dev), net))
>                         continue;
> +               /* Permit optimistic addresses, but only under explicitly
> +                * defined circumstances.
> +                */

I don't think this comment adds much of value, the code right below it
is pretty clear.

> +               bool optimistic_ok = (ifp->flags & IFA_F_OPTIMISTIC) &&
> +                                    (banned_flags == IFA_F_TENTATIVE);

Not sure if this can happen in any real use case, but I think that
technically this is incorrect if banned_flags contains both
IFA_F_TENTATIVE and other flags that aren't IFA_F_OPTIMISTIC. For
example, suppose banned_flags = IFA_F_TENTATIVE | IFA_F_PERMANENT. In
that case, I think the code would reject an address with
IFA_F_TENTATIVE | IFA_F_OPTIMISTIC. You might be able to fix that
using something like:

        int ifp_flags;

        ...

               ifp_flags = ifp->flags;
               if (ifp_flags & IFA_F_OPTIMISTIC) ifp_flags &= ~IFA_F_TENTATIVE;
               if (ipv6_addr_equal(&ifp->addr, addr) &&
                 !(ifp_flags & banned_flags) &&

Though I think that at this point your original formulation (the one
that treated IFA_F_TENTATIVE specially and did not pass it in via
banned_flags) might be faster/simpler/better.

Hannes, any better ideas?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Erik Kline Feb. 4, 2015, 10:38 a.m. UTC | #2
>> +               bool optimistic_ok = (ifp->flags & IFA_F_OPTIMISTIC) &&
>> +                                    (banned_flags == IFA_F_TENTATIVE);
>
> Not sure if this can happen in any real use case, but I think that
> technically this is incorrect if banned_flags contains both
> IFA_F_TENTATIVE and other flags that aren't IFA_F_OPTIMISTIC. For
> example, suppose banned_flags = IFA_F_TENTATIVE | IFA_F_PERMANENT. In
> that case, I think the code would reject an address with
> IFA_F_TENTATIVE | IFA_F_OPTIMISTIC. You might be able to fix that

Exactly what I was trying to address, but I tried to choose the
uber-conservative option and require folks to explicitly list when
optimistic should be treated distinctly from tentative.

That of course, wasn't really future-proof and someone might not know
or simply forget to update this check as needed.

>                ifp_flags = ifp->flags;
>                if (ifp_flags & IFA_F_OPTIMISTIC) ifp_flags &= ~IFA_F_TENTATIVE;

I think this is cleaner.  I've got a version now that incorporates this.

One more thing I just noticed, though: if the interface and the
address matches the supplied arguments but we don't like the flags we
keep on processing all other addresses on that interface I think.  Is
that right?  (Easy to fix in a follow on patch.)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Colitti Feb. 4, 2015, 10:47 a.m. UTC | #3
On Wed, Feb 4, 2015 at 7:38 PM, Erik Kline <ek@google.com> wrote:
> One more thing I just noticed, though: if the interface and the
> address matches the supplied arguments but we don't like the flags we
> keep on processing all other addresses on that interface I think.  Is
> that right?  (Easy to fix in a follow on patch.)

What else would the code do? It loops over all the addresses on the
system in hash table order, not interface-by-interface.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Erik Kline Feb. 4, 2015, 10:54 a.m. UTC | #4
On Wed, Feb 4, 2015 at 7:47 PM, Lorenzo Colitti <lorenzo@google.com> wrote:
> On Wed, Feb 4, 2015 at 7:38 PM, Erik Kline <ek@google.com> wrote:
>> One more thing I just noticed, though: if the interface and the
>> address matches the supplied arguments but we don't like the flags we
>> keep on processing all other addresses on that interface I think.  Is
>> that right?  (Easy to fix in a follow on patch.)
>
> What else would the code do? It loops over all the addresses on the
> system in hash table order, not interface-by-interface.

<off_topic>
Well, I was wondering about:

        hlist_for_each_entry_rcu(...) {
                if (!net_eq(dev_net(ifp->idev->dev), net))
                        continue;
                if (!ipv6_addr_equal(&ifp->addr, addr))
                        continue;

                if (flags_testing && etc) { unlock; return 1; }
                else break;
        }

But the correctness of this depends on where else an identical IPv6
address might be assigned.  Hard for me to say without more dedicated
time to study it.
</off_topic>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Colitti Feb. 4, 2015, 11:05 a.m. UTC | #5
On Wed, Feb 4, 2015 at 7:54 PM, Erik Kline <ek@google.com> wrote:
>                 if (flags_testing && etc) { unlock; return 1; }
>                 else break;

Oh, I see. I suppose it's possible to fail early if strict == 1, yes.
AIUI it's not possible for the same address to be assigned twice to
the same interface with different prefix lengths (unlike IPv4, which
allows this). But it's possible for the same address to be assigned to
multiple interfaces, so you can't do it if strict == 0.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index d13573b..80456f7 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -62,6 +62,9 @@  int addrconf_set_dstaddr(struct net *net, void __user *arg);
 
 int ipv6_chk_addr(struct net *net, const struct in6_addr *addr,
 		  const struct net_device *dev, int strict);
+int ipv6_chk_addr_and_flags(struct net *net, const struct in6_addr *addr,
+			    const struct net_device *dev, int strict,
+			    u32 banned_flags);
 
 #if defined(CONFIG_IPV6_MIP6) || defined(CONFIG_IPV6_MIP6_MODULE)
 int ipv6_chk_home_addr(struct net *net, const struct in6_addr *addr);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f7c8bbe..e9b795f 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1519,6 +1519,14 @@  static int ipv6_count_addresses(struct inet6_dev *idev)
 int ipv6_chk_addr(struct net *net, const struct in6_addr *addr,
 		  const struct net_device *dev, int strict)
 {
+	return ipv6_chk_addr_and_flags(net, addr, dev, strict, IFA_F_TENTATIVE);
+}
+EXPORT_SYMBOL(ipv6_chk_addr);
+
+int ipv6_chk_addr_and_flags(struct net *net, const struct in6_addr *addr,
+			    const struct net_device *dev, int strict,
+			    u32 banned_flags)
+{
 	struct inet6_ifaddr *ifp;
 	unsigned int hash = inet6_addr_hash(addr);
 
@@ -1526,8 +1534,13 @@  int ipv6_chk_addr(struct net *net, const struct in6_addr *addr,
 	hlist_for_each_entry_rcu(ifp, &inet6_addr_lst[hash], addr_lst) {
 		if (!net_eq(dev_net(ifp->idev->dev), net))
 			continue;
+		/* Permit optimistic addresses, but only under explicitly
+		 * defined circumstances.
+		 */
+		bool optimistic_ok = (ifp->flags & IFA_F_OPTIMISTIC) &&
+				     (banned_flags == IFA_F_TENTATIVE);
 		if (ipv6_addr_equal(&ifp->addr, addr) &&
-		    !(ifp->flags&IFA_F_TENTATIVE) &&
+		    (!(ifp->flags&banned_flags) || optimistic_ok) &&
 		    (dev == NULL || ifp->idev->dev == dev ||
 		     !(ifp->scope&(IFA_LINK|IFA_HOST) || strict))) {
 			rcu_read_unlock_bh();
@@ -1538,7 +1551,7 @@  int ipv6_chk_addr(struct net *net, const struct in6_addr *addr,
 	rcu_read_unlock_bh();
 	return 0;
 }
-EXPORT_SYMBOL(ipv6_chk_addr);
+EXPORT_SYMBOL(ipv6_chk_addr_and_flags);
 
 static bool ipv6_chk_same_addr(struct net *net, const struct in6_addr *addr,
 			       struct net_device *dev)
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 6828667..113fc6c 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -655,7 +655,9 @@  static void ndisc_solicit(struct neighbour *neigh, struct sk_buff *skb)
 	struct in6_addr *target = (struct in6_addr *)&neigh->primary_key;
 	int probes = atomic_read(&neigh->probes);
 
-	if (skb && ipv6_chk_addr(dev_net(dev), &ipv6_hdr(skb)->saddr, dev, 1))
+	if (skb && ipv6_chk_addr_and_flags(dev_net(dev), &ipv6_hdr(skb)->saddr,
+					   dev, 1,
+					   IFA_F_TENTATIVE|IFA_F_OPTIMISTIC))
 		saddr = &ipv6_hdr(skb)->saddr;
 	probes -= NEIGH_VAR(neigh->parms, UCAST_PROBES);
 	if (probes < 0) {