From patchwork Mon Apr 16 15:22:53 2018
X-Patchwork-Submitter: David Ahern
X-Patchwork-Id: 898717
X-Patchwork-Delegate: davem@davemloft.net
From: David Ahern
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, idosch@idosch.org, roopa@cumulusnetworks.com,
 eric.dumazet@gmail.com, weiwan@google.com, kafai@fb.com,
 yoshfuji@linux-ipv6.org, David Ahern
Subject: [PATCH net-next 19/21] net/ipv6: separate handling of FIB entries
 from dst based routes
Date: Mon, 16 Apr 2018 08:22:53 -0700
Message-Id: <20180416152255.2256-20-dsahern@gmail.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20180416152255.2256-1-dsahern@gmail.com>
References: <20180416152255.2256-1-dsahern@gmail.com>

Last step before flipping the data type for FIB entries:
- use fib6_info_alloc to create FIB entries in ip6_route_info_create
  and addrconf_dst_alloc
- use fib6_info_release in place of dst_release, ip6_rt_put and
  rt6_release
- remove the dst_hold before calling __ip6_ins_rt or ip6_del_rt
- when purging routes, drop per-cpu routes
- replace inc and dec of rt6i_ref with fib6_info_hold and fib6_info_release
- use rt->from since it points to the FIB entry
- drop references to exception bucket, fib6_metrics and per-cpu from
  dst entries (those are relevant for fib entries only)

Signed-off-by: David Ahern
---
 include/net/ip6_fib.h   |   4 +-
 include/net/ip6_route.h |   3 +-
 net/ipv6/addrconf.c     |  18 +++--
 net/ipv6/anycast.c      |   7 +-
 net/ipv6/ip6_fib.c      |  55 ++++++++++------
 net/ipv6/ip6_output.c   |   3 +-
 net/ipv6/ndisc.c        |   6 +-
 net/ipv6/route.c        | 171 +++++++++++++++++-------------------------------
 8 files changed, 115 insertions(+), 152 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 630392ae12d8..6c3d92bb3459 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -314,9 +314,7 @@ static inline u32 rt6_get_cookie(const struct rt6_info *rt)
 
 	if (rt->rt6i_flags & RTF_PCPU ||
 	    (unlikely(!list_empty(&rt->rt6i_uncached)) && rt->from))
-		rt = rt->from;
-
-	rt6_get_cookie_safe(rt, &cookie);
+		rt6_get_cookie_safe(rt->from, &cookie);
 
 	return cookie;
 }
diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index 686cdc7f356a..57d0d45667f1 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -114,8 +114,7 @@ static inline int ip6_route_get_saddr(struct net *net, struct rt6_info *rt,
 				      unsigned int prefs,
 				      struct in6_addr *saddr)
 {
-	struct inet6_dev *idev =
-		rt ? ip6_dst_idev((struct dst_entry *)rt) : NULL;
+	struct inet6_dev *idev = rt ? rt->rt6i_idev : NULL;
 	int err = 0;
 
 	if (rt && rt->rt6i_prefsrc.plen)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 8796f00ac714..32d7571e587f 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -916,7 +916,6 @@ void inet6_ifa_finish_destroy(struct inet6_ifaddr *ifp)
 		pr_warn("Freeing alive inet6 address %p\n", ifp);
 		return;
 	}
-	ip6_rt_put(ifp->rt);
 
 	kfree_rcu(ifp, rcu);
 }
@@ -1102,8 +1101,8 @@ ipv6_add_addr(struct inet6_dev *idev, const struct in6_addr *addr,
 	inet6addr_notifier_call_chain(NETDEV_UP, ifa);
 out:
 	if (unlikely(err < 0)) {
-		if (rt)
-			ip6_rt_put(rt);
+		fib6_info_release(rt);
+
 		if (ifa) {
 			if (ifa->idev)
 				in6_dev_put(ifa->idev);
@@ -1191,7 +1190,7 @@ cleanup_prefix_route(struct inet6_ifaddr *ifp, unsigned long expires, bool del_r
 		else {
 			if (!(rt->rt6i_flags & RTF_EXPIRES))
 				fib6_set_expires(rt, expires);
-			ip6_rt_put(rt);
+			fib6_info_release(rt);
 		}
 	}
 }
@@ -2375,8 +2374,7 @@ static struct rt6_info *addrconf_get_prefix_route(const struct in6_addr *pfx,
 			continue;
 		if ((rt->rt6i_flags & noflags) != 0)
 			continue;
-		if (!dst_hold_safe(&rt->dst))
-			rt = NULL;
+		fib6_info_hold(rt);
 		break;
 	}
 out:
@@ -2688,7 +2686,7 @@ void addrconf_prefix_rcv(struct net_device *dev, u8 *opt, int len, bool sllao)
 			addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len,
 					      dev, expires, flags, GFP_ATOMIC);
 		}
-		ip6_rt_put(rt);
+		fib6_info_release(rt);
 	}
 
 	/* Try to figure out our local address for this prefix */
@@ -3357,7 +3355,7 @@ static int fixup_permanent_addr(struct net *net,
 		ifp->rt = rt;
 		spin_unlock(&ifp->lock);
 
-		ip6_rt_put(prev);
+		fib6_info_release(prev);
 	}
 
 	if (!(ifp->flags & IFA_F_NOPREFIXROUTE)) {
@@ -5640,8 +5638,8 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
 				ip6_del_rt(net, rt);
 		}
 		if (ifp->rt) {
-			if (dst_hold_safe(&ifp->rt->dst))
-				ip6_del_rt(net, ifp->rt);
+			ip6_del_rt(net, ifp->rt);
+			ifp->rt = NULL;
 		}
 		rt_genid_bump_ipv6(net);
 		break;
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index e456386fe4d5..3db8fe10322b 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -213,7 +213,7 @@ static void aca_put(struct ifacaddr6 *ac)
 {
 	if (refcount_dec_and_test(&ac->aca_refcnt)) {
 		in6_dev_put(ac->aca_idev);
-		dst_release(&ac->aca_rt->dst);
+		fib6_info_release(ac->aca_rt);
 		kfree(ac);
 	}
 }
@@ -231,6 +231,7 @@ static struct ifacaddr6 *aca_alloc(struct rt6_info *rt,
 	aca->aca_addr = *addr;
 	in6_dev_hold(idev);
 	aca->aca_idev = idev;
+	fib6_info_hold(rt);
 	aca->aca_rt = rt;
 	aca->aca_users = 1;
 	/* aca_tstamp should be updated upon changes */
@@ -274,7 +275,7 @@ int __ipv6_dev_ac_inc(struct inet6_dev *idev, const struct in6_addr *addr)
 	}
 	aca = aca_alloc(rt, addr);
 	if (!aca) {
-		ip6_rt_put(rt);
+		fib6_info_release(rt);
 		err = -ENOMEM;
 		goto out;
 	}
@@ -330,7 +331,6 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr)
 	write_unlock_bh(&idev->lock);
 	addrconf_leave_solict(idev, &aca->aca_addr);
 
-	dst_hold(&aca->aca_rt->dst);
 	ip6_del_rt(dev_net(idev->dev), aca->aca_rt);
 
 	aca_put(aca);
@@ -358,7 +358,6 @@ void ipv6_ac_destroy_dev(struct inet6_dev *idev)
 
 		addrconf_leave_solict(idev, &aca->aca_addr);
 
-		dst_hold(&aca->aca_rt->dst);
 		ip6_del_rt(dev_net(idev->dev), aca->aca_rt);
 
 		aca_put(aca);
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index d07578d84db0..4d6bd033dccd 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -170,6 +170,7 @@ struct rt6_info *fib6_info_alloc(gfp_t gfp_flags)
 void fib6_info_destroy(struct rt6_info *f6i)
 {
 	struct rt6_exception_bucket *bucket;
+	struct dst_metrics *m;
 
 	WARN_ON(f6i->rt6i_node);
 
@@ -201,6 +202,10 @@ void fib6_info_destroy(struct rt6_info *f6i)
 	if (f6i->fib6_nh.nh_dev)
 		dev_put(f6i->fib6_nh.nh_dev);
 
+	m = f6i->fib6_metrics;
+	if (m != &dst_default_metrics && refcount_dec_and_test(&m->refcnt))
+		kfree(m);
+
 	kfree(f6i);
 }
 EXPORT_SYMBOL_GPL(fib6_info_destroy);
@@ -714,7 +719,7 @@ static struct fib6_node *fib6_add_1(struct net *net,
 			/* clean up an intermediate node */
 			if (!(fn->fn_flags & RTN_RTINFO)) {
 				RCU_INIT_POINTER(fn->leaf, NULL);
-				rt6_release(leaf);
+				fib6_info_release(leaf);
 			/* remove null_entry in the root node */
 			} else if (fn->fn_flags & RTN_TL_ROOT &&
 				   rcu_access_pointer(fn->leaf) ==
@@ -898,12 +903,32 @@ static void fib6_purge_rt(struct rt6_info *rt, struct fib6_node *fn,
 			if (!(fn->fn_flags & RTN_RTINFO) && leaf == rt) {
 				new_leaf = fib6_find_prefix(net, table, fn);
 				atomic_inc(&new_leaf->rt6i_ref);
+
 				rcu_assign_pointer(fn->leaf, new_leaf);
-				rt6_release(rt);
+				fib6_info_release(rt);
 			}
 			fn = rcu_dereference_protected(fn->parent,
 				    lockdep_is_held(&table->tb6_lock));
 		}
+
+		if (rt->rt6i_pcpu) {
+			int cpu;
+
+			/* release the reference to this fib entry from
+			 * all of its cached pcpu routes
+			 */
+			for_each_possible_cpu(cpu) {
+				struct rt6_info **ppcpu_rt;
+				struct rt6_info *pcpu_rt;
+
+				ppcpu_rt = per_cpu_ptr(rt->rt6i_pcpu, cpu);
+				pcpu_rt = *ppcpu_rt;
+				if (pcpu_rt) {
+					fib6_info_release(pcpu_rt->from);
+					pcpu_rt->from = NULL;
+				}
+			}
+		}
 	}
 }
 
@@ -1099,7 +1124,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 		fib6_purge_rt(iter, fn, info->nl_net);
 		if (rcu_access_pointer(fn->rr_ptr) == iter)
 			fn->rr_ptr = NULL;
-		rt6_release(iter);
+		fib6_info_release(iter);
 
 		if (nsiblings) {
 			/* Replacing an ECMP route, remove all siblings */
@@ -1115,7 +1140,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 					fib6_purge_rt(iter, fn, info->nl_net);
 					if (rcu_access_pointer(fn->rr_ptr) == iter)
 						fn->rr_ptr = NULL;
-					rt6_release(iter);
+					fib6_info_release(iter);
 					nsiblings--;
 					info->nl_net->ipv6.rt6_stats->fib_rt_entries--;
 				} else {
@@ -1183,9 +1208,6 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
 	int replace_required = 0;
 	int sernum = fib6_new_sernum(info->nl_net);
 
-	if (WARN_ON_ONCE(!atomic_read(&rt->dst.__refcnt)))
-		return -EINVAL;
-
 	if (info->nlh) {
 		if (!(info->nlh->nlmsg_flags & NLM_F_CREATE))
 			allow_create = 0;
@@ -1300,7 +1322,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
 			if (pn_leaf == rt) {
 				pn_leaf = NULL;
 				RCU_INIT_POINTER(pn->leaf, NULL);
-				atomic_dec(&rt->rt6i_ref);
+				fib6_info_release(rt);
 			}
 			if (!pn_leaf && !(pn->fn_flags & RTN_RTINFO)) {
 				pn_leaf = fib6_find_prefix(info->nl_net, table,
@@ -1312,7 +1334,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
 					    info->nl_net->ipv6.fib6_null_entry;
 				}
 #endif
-				atomic_inc(&pn_leaf->rt6i_ref);
+				fib6_info_hold(pn_leaf);
 				rcu_assign_pointer(pn->leaf, pn_leaf);
 			}
 		}
@@ -1334,10 +1356,6 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
 	     (fn->fn_flags & RTN_TL_ROOT &&
 	      !rcu_access_pointer(fn->leaf))))
 		fib6_repair_tree(info->nl_net, table, fn);
-	/* Always release dst as dst->__refcnt is guaranteed
-	 * to be taken before entering this function
-	 */
-	dst_release_immediate(&rt->dst);
 	return err;
 }
 
@@ -1637,7 +1655,7 @@ static struct fib6_node *fib6_repair_tree(struct net *net,
 				new_fn_leaf = net->ipv6.fib6_null_entry;
 			}
 #endif
-			atomic_inc(&new_fn_leaf->rt6i_ref);
+			fib6_info_hold(new_fn_leaf);
 			rcu_assign_pointer(fn->leaf, new_fn_leaf);
 			return pn;
 		}
@@ -1693,7 +1711,7 @@ static struct fib6_node *fib6_repair_tree(struct net *net,
 			return pn;
 
 		RCU_INIT_POINTER(pn->leaf, NULL);
-		rt6_release(pn_leaf);
+		fib6_info_release(pn_leaf);
 		fn = pn;
 	}
 }
@@ -1763,7 +1781,7 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
 	call_fib6_entry_notifiers(net, FIB_EVENT_ENTRY_DEL, rt, NULL);
 	if (!info->skip_notify)
 		inet6_rt_notify(RTM_DELROUTE, rt, info, 0);
-	rt6_release(rt);
+	fib6_info_release(rt);
 }
 
 /* Need to own table->tb6_lock */
@@ -2261,9 +2279,8 @@ static int ipv6_route_seq_show(struct seq_file *seq, void *v)
 
 	dev = rt->fib6_nh.nh_dev;
 	seq_printf(seq, " %08x %08x %08x %08x %8s\n",
-		   rt->rt6i_metric, atomic_read(&rt->dst.__refcnt),
-		   rt->dst.__use, rt->rt6i_flags,
-		   dev ? dev->name : "");
+		   rt->rt6i_metric, atomic_read(&rt->rt6i_ref), 0,
+		   rt->rt6i_flags, dev ? dev->name : "");
 	iter->w.leaf = NULL;
 	return 0;
 }
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 2e891d2c30ef..21faeb6aa224 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -972,7 +972,8 @@ static int ip6_dst_lookup_tail(struct net *net, const struct sock *sk,
 		if (!had_dst)
 			*dst = ip6_route_output(net, sk, fl6);
 		rt = (*dst)->error ? NULL : (struct rt6_info *)*dst;
-		err = ip6_route_get_saddr(net, rt, &fl6->daddr,
+		err = ip6_route_get_saddr(net, rt ? rt->from : NULL,
+					  &fl6->daddr,
 					  sk ? inet6_sk(sk)->srcprefs : 0,
 					  &fl6->saddr);
 		if (err)
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 556717154fa3..a28857088bff 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1283,7 +1283,7 @@ static void ndisc_router_discovery(struct sk_buff *skb)
 			ND_PRINTK(0, err,
 				  "RA: %s got default router without neighbour\n",
 				  __func__);
-			ip6_rt_put(rt);
+			fib6_info_release(rt);
 			return;
 		}
 	}
@@ -1313,7 +1313,7 @@ static void ndisc_router_discovery(struct sk_buff *skb)
 			ND_PRINTK(0, err,
 				  "RA: %s got default router without neighbour\n",
 				  __func__);
-			ip6_rt_put(rt);
+			fib6_info_release(rt);
 			return;
 		}
 		neigh->flags |= NTF_ROUTER;
@@ -1499,7 +1499,7 @@ static void ndisc_router_discovery(struct sk_buff *skb)
 		ND_PRINTK(2, warn, "RA: invalid RA options\n");
 	}
 out:
-	ip6_rt_put(rt);
+	fib6_info_release(rt);
 	if (neigh)
 		neigh_release(neigh);
 }
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 250fa2a5092a..a5ffb398bffb 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -351,13 +351,11 @@ static void rt6_info_init(struct rt6_info *rt)
 	memset(dst + 1, 0, sizeof(*rt) - sizeof(*dst));
 	INIT_LIST_HEAD(&rt->rt6i_siblings);
 	INIT_LIST_HEAD(&rt->rt6i_uncached);
-	rt->fib6_metrics = (struct dst_metrics *)&dst_default_metrics;
 }
 
 /* allocate dst with ip6_dst_ops */
-static struct rt6_info *__ip6_dst_alloc(struct net *net,
-					struct net_device *dev,
-					int flags)
+struct rt6_info *ip6_dst_alloc(struct net *net, struct net_device *dev,
+			       int flags)
 {
 	struct rt6_info *rt = dst_alloc(&net->ipv6.ip6_dst_ops, dev,
 					1, DST_OBSOLETE_FORCE_CHK, flags);
@@ -369,35 +367,15 @@ static struct rt6_info *__ip6_dst_alloc(struct net *net,
 
 	return rt;
 }
-
-struct rt6_info *ip6_dst_alloc(struct net *net,
-			       struct net_device *dev,
-			       int flags)
-{
-	struct rt6_info *rt = __ip6_dst_alloc(net, dev, flags);
-
-	if (rt) {
-		rt->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, GFP_ATOMIC);
-		if (!rt->rt6i_pcpu) {
-			dst_release_immediate(&rt->dst);
-			return NULL;
-		}
-	}
-
-	return rt;
-}
 EXPORT_SYMBOL(ip6_dst_alloc);
 
 static void ip6_dst_destroy(struct dst_entry *dst)
 {
 	struct rt6_info *rt = (struct rt6_info *)dst;
-	struct rt6_exception_bucket *bucket;
 	struct rt6_info *from = rt->from;
 	struct inet6_dev *idev;
-	struct dst_metrics *m;
 
 	dst_destroy_metrics_generic(dst);
-	free_percpu(rt->rt6i_pcpu);
 	rt6_uncached_list_del(rt);
 
 	idev = rt->rt6i_idev;
@@ -405,18 +383,9 @@ static void ip6_dst_destroy(struct dst_entry *dst)
 		rt->rt6i_idev = NULL;
 		in6_dev_put(idev);
 	}
-	bucket = rcu_dereference_protected(rt->rt6i_exception_bucket, 1);
-	if (bucket) {
-		rt->rt6i_exception_bucket = NULL;
-		kfree(bucket);
-	}
-
-	m = rt->fib6_metrics;
-	if (m != &dst_default_metrics && refcount_dec_and_test(&m->refcnt))
-		kfree(m);
 
 	rt->from = NULL;
-	dst_release(&from->dst);
+	fib6_info_release(from);
 }
 
 static void ip6_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
@@ -891,7 +860,7 @@ int rt6_route_rcv(struct net_device *dev, u8 *opt, int len,
 		else
 			fib6_set_expires(rt, jiffies + HZ * lifetime);
 
-		ip6_rt_put(rt);
+		fib6_info_release(rt);
 	}
 	return 0;
 }
@@ -1010,11 +979,9 @@ static void ip6_rt_init_dst(struct rt6_info *rt, struct rt6_info *ort)
 
 static void rt6_set_from(struct rt6_info *rt, struct rt6_info *from)
 {
-	BUG_ON(from->from);
-
 	rt->rt6i_flags &= ~RTF_EXPIRES;
-	if (dst_hold_safe(&from->dst))
-		rt->from = from;
+	fib6_info_hold(from);
+	rt->from = from;
 	dst_init_metrics(&rt->dst, from->fib6_metrics->metrics, true);
 	if (from->fib6_metrics != &dst_default_metrics) {
 		rt->dst._metrics |= DST_METRICS_REFCOUNTED;
@@ -1084,7 +1051,7 @@ static struct rt6_info *ip6_create_rt_rcu(struct rt6_info *rt)
 	struct net_device *dev = rt->fib6_nh.nh_dev;
 	struct rt6_info *nrt;
 
-	nrt = __ip6_dst_alloc(dev_net(dev), dev, flags);
+	nrt = ip6_dst_alloc(dev_net(dev), dev, flags);
 	if (nrt)
 		ip6_rt_copy_init(nrt, rt);
 
@@ -1203,8 +1170,6 @@ int ip6_ins_rt(struct net *net, struct rt6_info *rt)
 {
 	struct nl_info info = { .nl_net = net, };
 
-	/* Hold dst to account for the reference from the fib6 tree */
-	dst_hold(&rt->dst);
 	return __ip6_ins_rt(rt, &info, NULL);
 }
 
@@ -1221,7 +1186,7 @@ static struct rt6_info *ip6_rt_cache_alloc(struct rt6_info *ort,
 
 	rcu_read_lock();
 	dev = ip6_rt_get_dev_rcu(ort);
-	rt = __ip6_dst_alloc(dev_net(dev), dev, 0);
+	rt = ip6_dst_alloc(dev_net(dev), dev, 0);
 	rcu_read_unlock();
 	if (!rt)
 		return NULL;
@@ -1256,7 +1221,7 @@ static struct rt6_info *ip6_rt_pcpu_alloc(struct rt6_info *rt)
 
 	rcu_read_lock();
 	dev = ip6_rt_get_dev_rcu(rt);
-	pcpu_rt = __ip6_dst_alloc(dev_net(dev), dev, flags);
+	pcpu_rt = ip6_dst_alloc(dev_net(dev), dev, flags);
 	rcu_read_unlock();
 	if (!pcpu_rt)
 		return NULL;
@@ -1317,7 +1282,7 @@ static void rt6_remove_exception(struct rt6_exception_bucket *bucket,
 	net = dev_net(rt6_ex->rt6i->dst.dev);
 	rt6_ex->rt6i->rt6i_node = NULL;
 	hlist_del_rcu(&rt6_ex->hlist);
-	rt6_release(rt6_ex->rt6i);
+	ip6_rt_put(rt6_ex->rt6i);
 	kfree_rcu(rt6_ex, rcu);
 	WARN_ON_ONCE(!bucket->depth);
 	bucket->depth--;
@@ -1907,17 +1872,11 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 
 		struct rt6_info *uncached_rt;
 
-		if (ip6_hold_safe(net, &f6i, true)) {
-			dst_use_noref(&f6i->dst, jiffies);
-		} else {
-			rcu_read_unlock();
-			uncached_rt = f6i;
-			goto uncached_rt_out;
-		}
+		fib6_info_hold(f6i);
 		rcu_read_unlock();
 
 		uncached_rt = ip6_rt_cache_alloc(f6i, &fl6->daddr, NULL);
-		dst_release(&rt->dst);
+		fib6_info_release(f6i);
 
 		if (uncached_rt) {
 			/* Uncached_rt's refcnt is taken during ip6_rt_cache_alloc()
@@ -1930,7 +1889,6 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 			dst_hold(&uncached_rt->dst);
 		}
 
-uncached_rt_out:
 		trace_fib6_table_lookup(net, uncached_rt, table, fl6);
 		return uncached_rt;
 
@@ -1939,24 +1897,12 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 
 		struct rt6_info *pcpu_rt;
 
-		dst_use_noref(&f6i->dst, jiffies);
 		local_bh_disable();
 		pcpu_rt = rt6_get_pcpu_route(f6i);
 
-		if (!pcpu_rt) {
-			/* atomic_inc_not_zero() is needed when using rcu */
-			if (atomic_inc_not_zero(&f6i->rt6i_ref)) {
-				/* No dst_hold() on rt is needed because grabbing
-				 * rt->rt6i_ref makes sure rt can't be released.
-				 */
-				pcpu_rt = rt6_make_pcpu_route(net, f6i);
-				rt6_release(f6i);
-			} else {
-				/* rt is already removed from tree */
-				pcpu_rt = net->ipv6.ip6_null_entry;
-				dst_hold(&pcpu_rt->dst);
-			}
-		}
+		if (!pcpu_rt)
+			pcpu_rt = rt6_make_pcpu_route(net, f6i);
+
 		local_bh_enable();
 		rcu_read_unlock();
 		trace_fib6_table_lookup(net, pcpu_rt, table, fl6);
@@ -2193,11 +2139,26 @@ struct dst_entry *ip6_blackhole_route(struct net *net, struct dst_entry *dst_ori
  * Destination cache support functions
  */
 
+static bool fib6_check(struct rt6_info *f6i, u32 cookie)
+{
+	u32 rt_cookie = 0;
+
+	if ((f6i && !rt6_get_cookie_safe(f6i, &rt_cookie)) ||
+	     rt_cookie != cookie)
+		return false;
+
+	if (fib6_check_expired(f6i))
+		return false;
+
+	return true;
+}
+
 static struct dst_entry *rt6_check(struct rt6_info *rt, u32 cookie)
 {
 	u32 rt_cookie = 0;
 
-	if (!rt6_get_cookie_safe(rt, &rt_cookie) || rt_cookie != cookie)
+	if ((rt->from && !rt6_get_cookie_safe(rt->from, &rt_cookie)) ||
+	    rt_cookie != cookie)
 		return NULL;
 
 	if (rt6_check_expired(rt))
@@ -2210,7 +2171,7 @@ static struct dst_entry *rt6_dst_from_check(struct rt6_info *rt, u32 cookie)
 {
 	if (!__rt6_check_expired(rt) &&
 	    rt->dst.obsolete == DST_OBSOLETE_FORCE_CHK &&
-	    rt6_check(rt->from, cookie))
+	    fib6_check(rt->from, cookie))
 		return &rt->dst;
 	else
 		return NULL;
@@ -2241,7 +2202,7 @@ static struct dst_entry *ip6_negative_advice(struct dst_entry *dst)
 	if (rt) {
 		if (rt->rt6i_flags & RTF_CACHE) {
 			if (rt6_check_expired(rt)) {
-				ip6_del_rt(dev_net(dst->dev), rt);
+				rt6_remove_exception_rt(rt);
 				dst = NULL;
 			}
 		} else {
@@ -2262,12 +2223,12 @@ static void ip6_link_failure(struct sk_buff *skb)
 	if (rt) {
 		if (rt->rt6i_flags & RTF_CACHE) {
 			if (dst_hold_safe(&rt->dst))
-				ip6_del_rt(dev_net(rt->dst.dev), rt);
-		} else {
+				rt6_remove_exception_rt(rt);
+		} else if (rt->from) {
 			struct fib6_node *fn;
 
 			rcu_read_lock();
-			fn = rcu_dereference(rt->rt6i_node);
+			fn = rcu_dereference(rt->from->rt6i_node);
 			if (fn && (rt->rt6i_flags & RTF_DEFAULT))
 				fn->fn_sernum = -1;
 			rcu_read_unlock();
@@ -2949,13 +2910,13 @@ static struct rt6_info *ip6_route_info_create(struct fib6_config *cfg,
 	if (!table)
 		goto out;
 
-	rt = ip6_dst_alloc(net, NULL,
-			   (cfg->fc_flags & RTF_ADDRCONF) ? 0 : DST_NOCOUNT);
-
-	if (!rt) {
-		err = -ENOMEM;
+	err = -ENOMEM;
+	rt = fib6_info_alloc(gfp_flags);
+	if (!rt)
 		goto out;
-	}
+
+	if (cfg->fc_flags & RTF_ADDRCONF)
+		rt->dst_nocount = true;
 
 	err = ip6_convert_metrics(net, rt, cfg);
 	if (err < 0)
@@ -3029,7 +2990,7 @@ static struct rt6_info *ip6_route_info_create(struct fib6_config *cfg,
 		if (err)
 			goto out;
 
-		rt->fib6_nh.nh_gw = rt->rt6i_gateway = cfg->fc_gateway;
+		rt->fib6_nh.nh_gw = cfg->fc_gateway;
 	}
 
 	err = -ENODEV;
@@ -3066,7 +3027,7 @@ static struct rt6_info *ip6_route_info_create(struct fib6_config *cfg,
 	    !netif_carrier_ok(dev))
 		rt->fib6_nh.nh_flags |= RTNH_F_LINKDOWN;
 	rt->fib6_nh.nh_flags |= (cfg->fc_flags & RTNH_F_ONLINK);
-	rt->fib6_nh.nh_dev = rt->dst.dev = dev;
+	rt->fib6_nh.nh_dev = dev;
 	rt->rt6i_idev = idev;
 	rt->rt6i_table = table;
 
@@ -3078,9 +3039,8 @@ static struct rt6_info *ip6_route_info_create(struct fib6_config *cfg,
 		dev_put(dev);
 	if (idev)
 		in6_dev_put(idev);
-	if (rt)
-		dst_release_immediate(&rt->dst);
 
+	fib6_info_release(rt);
 	return ERR_PTR(err);
 }
 
@@ -3095,6 +3055,7 @@ int ip6_route_add(struct fib6_config *cfg, gfp_t gfp_flags,
 		return PTR_ERR(rt);
 
 	err = __ip6_ins_rt(rt, &cfg->fc_nlinfo, extack);
+	fib6_info_release(rt);
 
 	return err;
 }
@@ -3116,7 +3077,7 @@ static int __ip6_del_rt(struct rt6_info *rt, struct nl_info *info)
 	spin_unlock_bh(&table->tb6_lock);
 
 out:
-	ip6_rt_put(rt);
+	fib6_info_release(rt);
 	return err;
 }
 
@@ -3170,7 +3131,7 @@ static int __ip6_del_rt_siblings(struct rt6_info *rt, struct fib6_config *cfg)
 out_unlock:
 	spin_unlock_bh(&table->tb6_lock);
 out_put:
-	ip6_rt_put(rt);
+	fib6_info_release(rt);
 
 	if (skb) {
 		rtnl_notify(skb, net, info->portid, RTNLGRP_IPV6_ROUTE,
@@ -3241,8 +3202,7 @@ static int ip6_route_del(struct fib6_config *cfg,
 				continue;
 			if (cfg->fc_protocol && cfg->fc_protocol != rt->rt6i_protocol)
 				continue;
-			if (!dst_hold_safe(&rt->dst))
-				break;
+			fib6_info_hold(rt);
 			rcu_read_unlock();
 
 			/* if gateway was specified only delete the one hop */
@@ -3510,12 +3470,9 @@ static void __rt6_purge_dflt_routers(struct net *net,
 	for_each_fib6_node_rt_rcu(&table->tb6_root) {
 		if (rt->rt6i_flags & (RTF_DEFAULT | RTF_ADDRCONF) &&
 		    (!rt->rt6i_idev || rt->rt6i_idev->cnf.accept_ra != 2)) {
-			if (dst_hold_safe(&rt->dst)) {
-				rcu_read_unlock();
-				ip6_del_rt(net, rt);
-			} else {
-				rcu_read_unlock();
-			}
+			fib6_info_hold(rt);
+			rcu_read_unlock();
+			ip6_del_rt(net, rt);
 			goto restart;
 		}
 	}
@@ -3665,7 +3622,7 @@ struct rt6_info *addrconf_dst_alloc(struct net *net,
 	struct net_device *dev = idev->dev;
 	struct rt6_info *rt;
 
-	rt = ip6_dst_alloc(net, dev, DST_NOCOUNT);
+	rt = fib6_info_alloc(gfp_flags);
 	if (!rt)
 		return ERR_PTR(-ENOMEM);
 
@@ -3686,8 +3643,8 @@ struct rt6_info *addrconf_dst_alloc(struct net *net,
 	}
 
 	rt->fib6_nh.nh_gw = *addr;
+	dev_hold(dev);
 	rt->fib6_nh.nh_dev = dev;
-	rt->rt6i_gateway = *addr;
 	rt->rt6i_dst.addr = *addr;
 	rt->rt6i_dst.plen = 128;
 	tb_id = l3mdev_fib_table(idev->dev) ? : RT6_TABLE_LOCAL;
@@ -4324,7 +4281,7 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 		err = ip6_route_info_append(info->nl_net, &rt6_nh_list,
 					    rt, &r_cfg);
 		if (err) {
-			dst_release_immediate(&rt->dst);
+			fib6_info_release(rt);
 			goto cleanup;
 		}
 
@@ -4341,6 +4298,8 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 	list_for_each_entry(nh, &rt6_nh_list, next) {
 		rt_last = nh->rt6_info;
 		err = __ip6_ins_rt(nh->rt6_info, info, extack);
+		fib6_info_release(nh->rt6_info);
+
 		/* save reference to first route for notification */
 		if (!rt_notif && !err)
 			rt_notif = nh->rt6_info;
@@ -4388,7 +4347,7 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 cleanup:
 	list_for_each_entry_safe(nh, nh_safe, &rt6_nh_list, next) {
 		if (nh->rt6_info)
-			dst_release_immediate(&nh->rt6_info->dst);
+			fib6_info_release(nh->rt6_info);
 		list_del(&nh->next);
 		kfree(nh);
 	}
@@ -4813,14 +4772,6 @@ static int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 		goto errout;
 	}
 
-	if (fibmatch && rt->from) {
-		struct rt6_info *ort = rt->from;
-
-		dst_hold(&ort->dst);
-		ip6_rt_put(rt);
-		rt = ort;
-	}
-
 	skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
 	if (!skb) {
 		ip6_rt_put(rt);
@@ -4830,12 +4781,12 @@ static int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 
 	skb_dst_set(skb, &rt->dst);
 	if (fibmatch)
-		err = rt6_fill_node(net, skb, rt, NULL, NULL, NULL, iif,
+		err = rt6_fill_node(net, skb, rt->from, NULL, NULL, NULL, iif,
 				    RTM_NEWROUTE, NETLINK_CB(in_skb).portid,
 				    nlh->nlmsg_seq, 0);
 	else
-		err = rt6_fill_node(net, skb, rt, dst, &fl6.daddr, &fl6.saddr,
-				    iif, RTM_NEWROUTE,
+		err = rt6_fill_node(net, skb, rt->from, dst,
+				    &fl6.daddr, &fl6.saddr, iif, RTM_NEWROUTE,
 				    NETLINK_CB(in_skb).portid, nlh->nlmsg_seq,
 				    0);
 	if (err < 0) {