From patchwork Wed Apr 18 00:33:16 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Ahern
X-Patchwork-Id: 899790
X-Patchwork-Delegate: davem@davemloft.net
From: David Ahern
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, idosch@idosch.org, roopa@cumulusnetworks.com,
 eric.dumazet@gmail.com, weiwan@google.com, kafai@fb.com,
 yoshfuji@linux-ipv6.org, David Ahern
Subject: [PATCH net-next v2 10/21] net/ipv6: move metrics from dst to rt6_info
Date: Tue, 17 Apr 2018 17:33:16 -0700
Message-Id: <20180418003327.19992-11-dsahern@gmail.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20180418003327.19992-1-dsahern@gmail.com>
References: <20180418003327.19992-1-dsahern@gmail.com>
Sender: netdev-owner@vger.kernel.org
X-Mailing-List: netdev@vger.kernel.org

Similar to IPv4, add fib metrics to the fib struct, which at the moment
is rt6_info. Will be moved to fib6_info in a later patch. Copy metrics
into dst by reference using refcount.

To make the transition:
- add dst_metrics to rt6_info.
  Default to dst_default_metrics if no metrics are passed during route
  add. No need for a separate pmtu entry; it can reference the MTU slot
  in fib6_metrics
- ip6_convert_metrics allocates memory in the FIB entry and uses
  ip_metrics_convert to copy from netlink attribute to metrics entry
- the convert metrics call is done in ip6_route_info_create simplifying
  the route add path
  + fib6_commit_metrics and fib6_copy_metrics and the temporary
    mx6_config are no longer needed
- add fib6_metric_set helper to change the value of a metric in the
  fib entry since dst_metric_set can no longer be used
- cow_metrics for IPv6 can drop to dst_cow_metrics_generic
- rt6_dst_from_metrics_check is no longer needed
- rt6_fill_node needs the FIB entry and dst as separate arguments to
  keep compatibility with existing output. Current dst address is
  renamed to dest.
  (to be consistent with IPv4 rt6_fill_node really should be split
  into 2 functions similar to fib_dump_info and rt_fill_info)
- rt6_fill_node no longer needs the temporary metrics variable

Signed-off-by: David Ahern
---
 include/net/ip6_fib.h |  17 ++--
 net/core/dst.c        |   1 +
 net/ipv6/ip6_fib.c    |  66 +++++--------
 net/ipv6/ndisc.c      |  10 +-
 net/ipv6/route.c      | 257 +++++++++++++++++++-------------------------------
 5 files changed, 133 insertions(+), 218 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index f0a88370ba95..1f8dc9d12abb 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -94,11 +94,6 @@ struct fib6_gc_args {
 #define FIB6_SUBTREE(fn) (rcu_dereference_protected((fn)->subtree, 1))
 #endif
 
-struct mx6_config {
-	const u32 *mx;
-	DECLARE_BITMAP(mx_valid, RTAX_MAX);
-};
-
 /*
  * routing information
  *
@@ -176,7 +171,6 @@ struct rt6_info {
 	struct rt6_exception_bucket __rcu *rt6i_exception_bucket;
 
 	u32 rt6i_metric;
-	u32 rt6i_pmtu;
 	/* more non-fragment space at head required */
 	unsigned short rt6i_nfheader_len;
 	u8 rt6i_protocol;
@@ -185,6 +179,8 @@ struct rt6_info {
 					should_flush:1,
 					unused:6;
 
+	struct dst_metrics *fib6_metrics;
+#define fib6_pmtu fib6_metrics->metrics[RTAX_MTU-1]
 	struct fib6_nh fib6_nh;
 };
 
@@ -390,8 +386,7 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
 		    void *arg);
 
 int fib6_add(struct fib6_node *root, struct rt6_info *rt,
-	     struct nl_info *info, struct mx6_config *mxc,
-	     struct netlink_ext_ack *extack);
+	     struct nl_info *info, struct netlink_ext_ack *extack);
 int fib6_del(struct rt6_info *rt, struct nl_info *info);
 
 void inet6_rt_notify(int event, struct rt6_info *rt, struct nl_info *info,
@@ -420,6 +415,12 @@ int fib6_tables_dump(struct net *net, struct notifier_block *nb);
 void fib6_update_sernum(struct net *net, struct rt6_info *rt);
 void fib6_update_sernum_upto_root(struct net *net, struct rt6_info *rt);
 
+void fib6_metric_set(struct rt6_info *f6i, int metric, u32 val);
+static inline bool fib6_metric_locked(struct rt6_info *f6i, int metric)
+{
+	return !!(f6i->fib6_metrics->metrics[RTAX_LOCK - 1] & (1 << metric));
+}
+
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
 int fib6_rules_init(void);
 void fib6_rules_cleanup(void);
diff --git a/net/core/dst.c b/net/core/dst.c
index 007aa0b08291..2d9b37f8944a 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -58,6 +58,7 @@ const struct dst_metrics dst_default_metrics = {
 	 */
 	.refcnt = REFCOUNT_INIT(1),
 };
+EXPORT_SYMBOL(dst_default_metrics);
 
 void dst_init(struct dst_entry *dst, struct dst_ops *ops,
 	      struct net_device *dev, int initial_ref, int initial_obsolete,
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 64b73e65f114..0d94c56c3e41 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -578,6 +578,24 @@ static int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
 	return res;
 }
 
+void fib6_metric_set(struct rt6_info *f6i, int metric, u32 val)
+{
+	if (!f6i)
+		return;
+
+	if (f6i->fib6_metrics == &dst_default_metrics) {
+		struct dst_metrics *p = kzalloc(sizeof(*p), GFP_ATOMIC);
+
+		if (!p)
+			return;
+
+		refcount_set(&p->refcnt, 1);
+		f6i->fib6_metrics = p;
+	}
+
+	f6i->fib6_metrics->metrics[metric - 1] = val;
+}
+
 /*
  * Routing Table
  *
@@ -801,38 +819,6 @@ static struct fib6_node *fib6_add_1(struct net *net,
 	return ln;
 }
 
-static void fib6_copy_metrics(u32 *mp, const struct mx6_config *mxc)
-{
-	int i;
-
-	for (i = 0; i < RTAX_MAX; i++) {
-		if (test_bit(i, mxc->mx_valid))
-			mp[i] = mxc->mx[i];
-	}
-}
-
-static int fib6_commit_metrics(struct dst_entry *dst, struct mx6_config *mxc)
-{
-	if (!mxc->mx)
-		return 0;
-
-	if (dst->flags & DST_HOST) {
-		u32 *mp = dst_metrics_write_ptr(dst);
-
-		if (unlikely(!mp))
-			return -ENOMEM;
-
-		fib6_copy_metrics(mp, mxc);
-	} else {
-		dst_init_metrics(dst, mxc->mx, false);
-
-		/* We've stolen mx now. */
-		mxc->mx = NULL;
-	}
-
-	return 0;
-}
-
 static void fib6_purge_rt(struct rt6_info *rt, struct fib6_node *fn,
 			  struct net *net)
 {
@@ -866,7 +852,7 @@ static void fib6_purge_rt(struct rt6_info *rt, struct fib6_node *fn,
  */
 
 static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
-			    struct nl_info *info, struct mx6_config *mxc,
+			    struct nl_info *info,
 			    struct netlink_ext_ack *extack)
 {
 	struct rt6_info *leaf = rcu_dereference_protected(fn->leaf,
@@ -923,7 +909,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 					rt6_clean_expires(iter);
 				else
 					rt6_set_expires(iter, rt->dst.expires);
-				iter->rt6i_pmtu = rt->rt6i_pmtu;
+				fib6_metric_set(iter, RTAX_MTU, rt->fib6_pmtu);
 				return -EEXIST;
 			}
 			/* If we have the same destination and the same metric,
@@ -1002,9 +988,6 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 
 add:
 		nlflags |= NLM_F_CREATE;
-		err = fib6_commit_metrics(&rt->dst, mxc);
-		if (err)
-			return err;
 
 		err = call_fib6_entry_notifiers(info->nl_net,
 						FIB_EVENT_ENTRY_ADD,
@@ -1035,10 +1018,6 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 			return -ENOENT;
 		}
 
-		err = fib6_commit_metrics(&rt->dst, mxc);
-		if (err)
-			return err;
-
 		err = call_fib6_entry_notifiers(info->nl_net,
 						FIB_EVENT_ENTRY_REPLACE,
 						rt, extack);
@@ -1135,8 +1114,7 @@ void fib6_update_sernum_upto_root(struct net *net, struct rt6_info *rt)
  */
 
 int fib6_add(struct fib6_node *root, struct rt6_info *rt,
-	     struct nl_info *info, struct mx6_config *mxc,
-	     struct netlink_ext_ack *extack)
+	     struct nl_info *info, struct netlink_ext_ack *extack)
 {
 	struct fib6_table *table = rt->rt6i_table;
 	struct fib6_node *fn, *pn = NULL;
@@ -1244,7 +1222,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
 	}
 #endif
 
-	err = fib6_add_rt2node(fn, rt, info, mxc, extack);
+	err = fib6_add_rt2node(fn, rt, info, extack);
 	if (!err) {
 		__fib6_update_sernum_upto_root(rt, sernum);
 		fib6_start_gc(info->nl_net, rt);
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 3fbc3805e69b..b058ea9ecec0 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1323,9 +1323,8 @@ static void ndisc_router_discovery(struct sk_buff *skb)
 	    ra_msg->icmph.icmp6_hop_limit) {
 		if (in6_dev->cnf.accept_ra_min_hop_limit <= ra_msg->icmph.icmp6_hop_limit) {
 			in6_dev->cnf.hop_limit = ra_msg->icmph.icmp6_hop_limit;
-			if (rt)
-				dst_metric_set(&rt->dst, RTAX_HOPLIMIT,
-					       ra_msg->icmph.icmp6_hop_limit);
+			fib6_metric_set(rt, RTAX_HOPLIMIT,
+					ra_msg->icmph.icmp6_hop_limit);
 		} else {
 			ND_PRINTK(2, warn, "RA: Got route advertisement with lower hop_limit than minimum\n");
 		}
@@ -1477,10 +1476,7 @@ static void ndisc_router_discovery(struct sk_buff *skb)
 			ND_PRINTK(2, warn, "RA: invalid mtu: %d\n", mtu);
 		} else if (in6_dev->cnf.mtu6 != mtu) {
 			in6_dev->cnf.mtu6 = mtu;
-
-			if (rt)
-				dst_metric_set(&rt->dst, RTAX_MTU, mtu);
-
+			fib6_metric_set(rt, RTAX_MTU, mtu);
 			rt6_mtu_change(skb->dev, mtu);
 		}
 	}
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 3b301aafd2ed..62aafd06c35f 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -96,12 +96,11 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
 					   struct sk_buff *skb, u32 mtu);
 static void rt6_do_redirect(struct dst_entry *dst, struct sock *sk,
 					struct sk_buff *skb);
-static void rt6_dst_from_metrics_check(struct rt6_info *rt);
 static int rt6_score_route(struct rt6_info *rt, int oif, int strict);
 static size_t rt6_nlmsg_size(struct rt6_info *rt);
-static int rt6_fill_node(struct net *net,
-			 struct sk_buff *skb, struct rt6_info *rt,
-			 struct in6_addr *dst, struct in6_addr *src,
+static int rt6_fill_node(struct net *net, struct sk_buff *skb,
+			 struct rt6_info *rt, struct dst_entry *dst,
+			 struct in6_addr *dest, struct in6_addr *src,
 			 int iif, int type, u32 portid, u32 seq,
 			 unsigned int flags);
 static struct rt6_info *rt6_find_cached_rt(struct rt6_info *rt,
@@ -183,23 +182,6 @@ static void rt6_uncached_list_flush_dev(struct net *net, struct net_device *dev)
 	}
 }
 
-static u32 *rt6_pcpu_cow_metrics(struct rt6_info *rt)
-{
-	return dst_metrics_write_ptr(&rt->from->dst);
-}
-
-static u32 *ipv6_cow_metrics(struct dst_entry *dst, unsigned long old)
-{
-	struct rt6_info *rt = (struct rt6_info *)dst;
-
-	if (rt->rt6i_flags & RTF_PCPU)
-		return rt6_pcpu_cow_metrics(rt);
-	else if (rt->rt6i_flags & RTF_CACHE)
-		return NULL;
-	else
-		return dst_cow_metrics_generic(dst, old);
-}
-
 static inline const void *choose_neigh_daddr(struct rt6_info *rt,
 					     struct sk_buff *skb,
 					     const void *daddr)
@@ -249,7 +231,7 @@ static struct dst_ops ip6_dst_ops_template = {
 	.check = ip6_dst_check,
 	.default_advmss = ip6_default_advmss,
 	.mtu = ip6_mtu,
-	.cow_metrics = ipv6_cow_metrics,
+	.cow_metrics = dst_cow_metrics_generic,
 	.destroy = ip6_dst_destroy,
 	.ifdown = ip6_dst_ifdown,
 	.negative_advice = ip6_negative_advice,
@@ -353,6 +335,7 @@ static void rt6_info_init(struct rt6_info *rt)
 	memset(dst + 1, 0, sizeof(*rt) - sizeof(*dst));
 	INIT_LIST_HEAD(&rt->rt6i_siblings);
 	INIT_LIST_HEAD(&rt->rt6i_uncached);
+	rt->fib6_metrics = (struct dst_metrics *)&dst_default_metrics;
 }
 
 /* allocate dst with ip6_dst_ops */
@@ -395,6 +378,7 @@ static void ip6_dst_destroy(struct dst_entry *dst)
 	struct rt6_exception_bucket *bucket;
 	struct rt6_info *from = rt->from;
 	struct inet6_dev *idev;
+	struct dst_metrics *m;
 
 	dst_destroy_metrics_generic(dst);
 	free_percpu(rt->rt6i_pcpu);
@@ -411,6 +395,10 @@ static void ip6_dst_destroy(struct dst_entry *dst)
 		kfree(bucket);
 	}
 
+	m = rt->fib6_metrics;
+	if (m != &dst_default_metrics && refcount_dec_and_test(&m->refcnt))
+		kfree(m);
+
 	rt->from = NULL;
 	dst_release(&from->dst);
 }
@@ -996,7 +984,11 @@ static void rt6_set_from(struct rt6_info *rt, struct rt6_info *from)
 	rt->rt6i_flags &= ~RTF_EXPIRES;
 	dst_hold(&from->dst);
 	rt->from = from;
-	dst_init_metrics(&rt->dst, dst_metrics_ptr(&from->dst), true);
+	dst_init_metrics(&rt->dst, from->fib6_metrics->metrics, true);
+	if (from->fib6_metrics != &dst_default_metrics) {
+		rt->dst._metrics |= DST_METRICS_REFCOUNTED;
+		refcount_inc(&from->fib6_metrics->refcnt);
+	}
 }
 
 static void ip6_rt_copy_init(struct rt6_info *rt, struct rt6_info *ort)
@@ -1140,7 +1132,6 @@ EXPORT_SYMBOL(rt6_lookup);
  */
 
 static int __ip6_ins_rt(struct rt6_info *rt, struct nl_info *info,
-			struct mx6_config *mxc,
 			struct netlink_ext_ack *extack)
 {
 	int err;
@@ -1148,7 +1139,7 @@ static int __ip6_ins_rt(struct rt6_info *rt, struct nl_info *info,
 
 	table = rt->rt6i_table;
 	spin_lock_bh(&table->tb6_lock);
-	err = fib6_add(&table->tb6_root, rt, info, mxc, extack);
+	err = fib6_add(&table->tb6_root, rt, info, extack);
 	spin_unlock_bh(&table->tb6_lock);
 
 	return err;
@@ -1157,11 +1148,10 @@ static int __ip6_ins_rt(struct rt6_info *rt, struct nl_info *info,
 int ip6_ins_rt(struct net *net, struct rt6_info *rt)
 {
 	struct nl_info info = { .nl_net = net, };
-	struct mx6_config mxc = { .mx = NULL, };
 
 	/* Hold dst to account for the reference from the fib6 tree */
 	dst_hold(&rt->dst);
-	return __ip6_ins_rt(rt, &info, &mxc, NULL);
+	return __ip6_ins_rt(rt, &info, NULL);
 }
 
 static struct rt6_info *ip6_rt_cache_alloc(struct rt6_info *ort,
@@ -1232,8 +1222,8 @@ static struct rt6_info *rt6_get_pcpu_route(struct rt6_info *rt)
 	p = this_cpu_ptr(rt->rt6i_pcpu);
 	pcpu_rt = *p;
 
-	if (pcpu_rt && ip6_hold_safe(NULL, &pcpu_rt, false))
-		rt6_dst_from_metrics_check(pcpu_rt);
+	if (pcpu_rt)
+		ip6_hold_safe(NULL, &pcpu_rt, false);
 
 	return pcpu_rt;
 }
@@ -1254,7 +1244,6 @@ static struct rt6_info *rt6_make_pcpu_route(struct net *net,
 	prev = cmpxchg(p, NULL, pcpu_rt);
 	BUG_ON(prev);
 
-	rt6_dst_from_metrics_check(pcpu_rt);
 	return pcpu_rt;
 }
 
@@ -1384,6 +1373,16 @@ __rt6_find_exception_rcu(struct rt6_exception_bucket **bucket,
 	return NULL;
 }
 
+static unsigned int fib6_mtu(const struct rt6_info *rt)
+{
+	unsigned int mtu;
+
+	mtu = rt->fib6_pmtu ? : rt->rt6i_idev->cnf.mtu6;
+	mtu = min_t(unsigned int, mtu, IP6_MAX_MTU);
+
+	return mtu - lwtunnel_headroom(rt->fib6_nh.nh_lwtstate, mtu);
+}
+
 static int rt6_insert_exception(struct rt6_info *nrt,
 				struct rt6_info *ort)
 {
@@ -1436,7 +1435,7 @@ static int rt6_insert_exception(struct rt6_info *nrt,
 	 * Only insert this exception route if its mtu
 	 * is less than ort's mtu value.
 	 */
-	if (nrt->rt6i_pmtu >= dst_mtu(&ort->dst)) {
+	if (dst_metric_raw(&nrt->dst, RTAX_MTU) >= fib6_mtu(ort)) {
 		err = -EINVAL;
 		goto out;
 	}
@@ -1673,12 +1672,12 @@ static void rt6_exceptions_update_pmtu(struct inet6_dev *idev,
 			struct rt6_info *entry = rt6_ex->rt6i;
 
 			/* For RTF_CACHE with rt6i_pmtu == 0 (i.e. a redirected
-			 * route), the metrics of its rt->dst.from have already
+			 * route), the metrics of its rt->from have already
 			 * been updated.
 			 */
-			if (entry->rt6i_pmtu &&
+			if (dst_metric_raw(&entry->dst, RTAX_MTU) &&
 			    rt6_mtu_change_route_allowed(idev, entry, mtu))
-				entry->rt6i_pmtu = mtu;
+				dst_metric_set(&entry->dst, RTAX_MTU, mtu);
 		}
 		bucket++;
 	}
@@ -1844,10 +1843,9 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 		trace_fib6_table_lookup(net, rt, table, fl6);
 		return rt;
 	} else if (rt->rt6i_flags & RTF_CACHE) {
-		if (ip6_hold_safe(net, &rt, true)) {
+		if (ip6_hold_safe(net, &rt, true))
 			dst_use_noref(&rt->dst, jiffies);
-			rt6_dst_from_metrics_check(rt);
-		}
+
 		rcu_read_unlock();
 		trace_fib6_table_lookup(net, rt, table, fl6);
 		return rt;
@@ -2147,13 +2145,6 @@ struct dst_entry *ip6_blackhole_route(struct net *net, struct dst_entry *dst_ori
  * Destination cache support functions
  */
 
-static void rt6_dst_from_metrics_check(struct rt6_info *rt)
-{
-	if (rt->from &&
-	    dst_metrics_ptr(&rt->dst) != dst_metrics_ptr(&rt->from->dst))
-		dst_init_metrics(&rt->dst, dst_metrics_ptr(&rt->from->dst), true);
-}
-
 static struct dst_entry *rt6_check(struct rt6_info *rt, u32 cookie)
 {
 	u32 rt_cookie = 0;
@@ -2188,8 +2179,6 @@ static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
 	 * into this function always.
 	 */
 
-	rt6_dst_from_metrics_check(rt);
-
 	if (rt->rt6i_flags & RTF_PCPU ||
 	    (unlikely(!list_empty(&rt->rt6i_uncached)) && rt->from))
 		return rt6_dst_from_check(rt, cookie);
@@ -2242,8 +2231,8 @@ static void rt6_do_update_pmtu(struct rt6_info *rt, u32 mtu)
 {
 	struct net *net = dev_net(rt->dst.dev);
 
+	dst_metric_set(&rt->dst, RTAX_MTU, mtu);
 	rt->rt6i_flags |= RTF_MODIFIED;
-	rt->rt6i_pmtu = mtu;
 	rt6_update_expires(rt, net->ipv6.sysctl.ip6_rt_mtu_expires);
 }
 
@@ -2289,10 +2278,10 @@ static void __ip6_rt_update_pmtu(struct dst_entry *dst, const struct sock *sk,
 	} else if (daddr) {
 		struct rt6_info *nrt6;
 
-		nrt6 = ip6_rt_cache_alloc(rt6, daddr, saddr);
+		nrt6 = ip6_rt_cache_alloc(rt6->from, daddr, saddr);
 		if (nrt6) {
 			rt6_do_update_pmtu(nrt6, mtu);
-			if (rt6_insert_exception(nrt6, rt6))
+			if (rt6_insert_exception(nrt6, rt6->from))
 				dst_release_immediate(&nrt6->dst);
 		}
 	}
@@ -2533,12 +2522,8 @@ static unsigned int ip6_default_advmss(const struct dst_entry *dst)
 
 static unsigned int ip6_mtu(const struct dst_entry *dst)
 {
-	const struct rt6_info *rt = (const struct rt6_info *)dst;
-	unsigned int mtu = rt->rt6i_pmtu;
 	struct inet6_dev *idev;
-
-	if (mtu)
-		goto out;
+	unsigned int mtu;
 
 	mtu = dst_metric_raw(dst, RTAX_MTU);
 	if (mtu)
@@ -2622,60 +2607,24 @@ static int ip6_dst_gc(struct dst_ops *ops)
 	return entries > rt_max_size;
 }
 
-static int ip6_convert_metrics(struct mx6_config *mxc,
-			       const struct fib6_config *cfg)
+static int ip6_convert_metrics(struct net *net, struct rt6_info *rt,
+			       struct fib6_config *cfg)
 {
-	struct net *net = cfg->fc_nlinfo.nl_net;
-	bool ecn_ca = false;
-	struct nlattr *nla;
-	int remaining;
-	u32 *mp;
-
-	if (!cfg->fc_mx)
-		return 0;
-
-	mp = kzalloc(sizeof(u32) * RTAX_MAX, GFP_KERNEL);
-	if (unlikely(!mp))
-		return -ENOMEM;
-
-	nla_for_each_attr(nla, cfg->fc_mx, cfg->fc_mx_len, remaining) {
-		int type = nla_type(nla);
-		u32 val;
-
-		if (!type)
-			continue;
-		if (unlikely(type > RTAX_MAX))
-			goto err;
-
-		if (type == RTAX_CC_ALGO) {
-			char tmp[TCP_CA_NAME_MAX];
+	int err = 0;
 
-			nla_strlcpy(tmp, nla, sizeof(tmp));
-			val = tcp_ca_get_key_by_name(net, tmp, &ecn_ca);
-			if (val == TCP_CA_UNSPEC)
-				goto err;
-		} else {
-			val = nla_get_u32(nla);
-		}
-		if (type == RTAX_HOPLIMIT && val > 255)
-			val = 255;
-		if (type == RTAX_FEATURES && (val & ~RTAX_FEATURE_MASK))
-			goto err;
+	if (cfg->fc_mx) {
+		rt->fib6_metrics = kzalloc(sizeof(*rt->fib6_metrics),
+					   GFP_KERNEL);
+		if (unlikely(!rt->fib6_metrics))
+			return -ENOMEM;
 
-		mp[type - 1] = val;
-		__set_bit(type - 1, mxc->mx_valid);
-	}
+		refcount_set(&rt->fib6_metrics->refcnt, 1);
 
-	if (ecn_ca) {
-		__set_bit(RTAX_FEATURES - 1, mxc->mx_valid);
-		mp[RTAX_FEATURES - 1] |= DST_FEATURE_ECN_CA;
+		err = ip_metrics_convert(net, cfg->fc_mx, cfg->fc_mx_len,
+					 rt->fib6_metrics->metrics);
 	}
 
-	mxc->mx = mp;
-	return 0;
-err:
-	kfree(mp);
-	return -EINVAL;
+	return err;
 }
 
 static struct rt6_info *ip6_nh_lookup_table(struct net *net,
@@ -2955,6 +2904,10 @@ static struct rt6_info *ip6_route_info_create(struct fib6_config *cfg,
 		goto out;
 	}
 
+	err = ip6_convert_metrics(net, rt, cfg);
+	if (err < 0)
+		goto out;
+
 	if (cfg->fc_flags & RTF_EXPIRES)
 		rt6_set_expires(rt, jiffies +
 				clock_t_to_jiffies(cfg->fc_expires));
@@ -3078,32 +3031,16 @@ static struct rt6_info *ip6_route_info_create(struct fib6_config *cfg,
 	return ERR_PTR(err);
 }
 
-int ip6_route_add(struct fib6_config *cfg,
-		  struct netlink_ext_ack *extack)
+int ip6_route_add(struct fib6_config *cfg, struct netlink_ext_ack *extack)
 {
-	struct mx6_config mxc = { .mx = NULL, };
 	struct rt6_info *rt;
 	int err;
 
 	rt = ip6_route_info_create(cfg, extack);
-	if (IS_ERR(rt)) {
-		err = PTR_ERR(rt);
-		rt = NULL;
-		goto out;
-	}
-
-	err = ip6_convert_metrics(&mxc, cfg);
-	if (err)
-		goto out;
-
-	err = __ip6_ins_rt(rt, &cfg->fc_nlinfo, &mxc, extack);
-
-	kfree(mxc.mx);
+	if (IS_ERR(rt))
+		return PTR_ERR(rt);
 
-	return err;
-out:
-	if (rt)
-		dst_release_immediate(&rt->dst);
+	err = __ip6_ins_rt(rt, &cfg->fc_nlinfo, extack);
 
 	return err;
 }
@@ -3157,7 +3094,7 @@ static int __ip6_del_rt_siblings(struct rt6_info *rt, struct fib6_config *cfg)
 	if (skb) {
 		u32 seq = info->nlh ? info->nlh->nlmsg_seq : 0;
 
-		if (rt6_fill_node(net, skb, rt,
+		if (rt6_fill_node(net, skb, rt, NULL,
 				  NULL, NULL, 0, RTM_DELROUTE,
 				  info->portid, seq, 0) < 0) {
 			kfree_skb(skb);
@@ -3348,7 +3285,7 @@ static void rt6_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_bu
 	 * a cached route because rt6_insert_exception() will
 	 * takes care of it
 	 */
-	if (rt6_insert_exception(nrt, rt)) {
+	if (rt6_insert_exception(nrt, rt->from)) {
 		dst_release_immediate(&nrt->dst);
 		goto out;
 	}
@@ -4018,11 +3955,14 @@ static int rt6_mtu_change_route(struct rt6_info *rt, void *p_arg)
 	   update PMTU increase is a MUST. (i.e. jumbo frame)
 	 */
 	if (rt->fib6_nh.nh_dev == arg->dev &&
-	    !dst_metric_locked(&rt->dst, RTAX_MTU)) {
+	    !fib6_metric_locked(rt, RTAX_MTU)) {
+		u32 mtu = rt->fib6_pmtu;
+
+		if (mtu >= arg->mtu ||
+		    (mtu < arg->mtu && mtu == idev->cnf.mtu6))
+			fib6_metric_set(rt, RTAX_MTU, arg->mtu);
+
 		spin_lock_bh(&rt6_exception_lock);
-		if (dst_metric_raw(&rt->dst, RTAX_MTU) &&
-		    rt6_mtu_change_route_allowed(idev, rt, arg->mtu))
-			dst_metric_set(&rt->dst, RTAX_MTU, arg->mtu);
 		rt6_exceptions_update_pmtu(idev, rt, arg->mtu);
 		spin_unlock_bh(&rt6_exception_lock);
 	}
@@ -4183,7 +4123,6 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
 struct rt6_nh {
 	struct rt6_info *rt6_info;
 	struct fib6_config r_cfg;
-	struct mx6_config mxc;
 	struct list_head next;
 };
 
@@ -4198,7 +4137,8 @@ static void ip6_print_replace_route_err(struct list_head *rt6_nh_list)
 	}
 }
 
-static int ip6_route_info_append(struct list_head *rt6_nh_list,
+static int ip6_route_info_append(struct net *net,
+				 struct list_head *rt6_nh_list,
 				 struct rt6_info *rt, struct fib6_config *r_cfg)
 {
 	struct rt6_nh *nh;
@@ -4214,7 +4154,7 @@ static int ip6_route_info_append(struct list_head *rt6_nh_list,
 	if (!nh)
 		return -ENOMEM;
 	nh->rt6_info = rt;
-	err = ip6_convert_metrics(&nh->mxc, r_cfg);
+	err = ip6_convert_metrics(net, rt, r_cfg);
 	if (err) {
 		kfree(nh);
 		return err;
@@ -4305,7 +4245,8 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 
 		rt->fib6_nh.nh_weight = rtnh->rtnh_hops + 1;
 
-		err = ip6_route_info_append(&rt6_nh_list, rt, &r_cfg);
+		err = ip6_route_info_append(info->nl_net, &rt6_nh_list,
+					    rt, &r_cfg);
 		if (err) {
 			dst_release_immediate(&rt->dst);
 			goto cleanup;
@@ -4323,7 +4264,7 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 	err_nh = NULL;
 	list_for_each_entry(nh, &rt6_nh_list, next) {
 		rt_last = nh->rt6_info;
-		err = __ip6_ins_rt(nh->rt6_info, info, &nh->mxc, extack);
+		err = __ip6_ins_rt(nh->rt6_info, info, extack);
 		/* save reference to first route for notification */
 		if (!rt_notif && !err)
 			rt_notif = nh->rt6_info;
@@ -4372,7 +4313,6 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 	list_for_each_entry_safe(nh, nh_safe, &rt6_nh_list, next) {
 		if (nh->rt6_info)
 			dst_release_immediate(&nh->rt6_info->dst);
-		kfree(nh->mxc.mx);
 		list_del(&nh->next);
 		kfree(nh);
 	}
@@ -4546,16 +4486,16 @@ static int rt6_add_nexthop(struct sk_buff *skb, struct rt6_info *rt)
 	return -EMSGSIZE;
 }
 
-static int rt6_fill_node(struct net *net,
-			 struct sk_buff *skb, struct rt6_info *rt,
-			 struct in6_addr *dst, struct in6_addr *src,
+static int rt6_fill_node(struct net *net, struct sk_buff *skb,
+			 struct rt6_info *rt, struct dst_entry *dst,
+			 struct in6_addr *dest, struct in6_addr *src,
 			 int iif, int type, u32 portid, u32 seq,
 			 unsigned int flags)
 {
-	u32 metrics[RTAX_MAX];
 	struct rtmsg *rtm;
 	struct nlmsghdr *nlh;
-	long expires;
+	long expires = 0;
+	u32 *pmetrics;
 	u32 table;
 
 	nlh = nlmsg_put(skb, portid, seq, type, sizeof(*rtm), flags);
@@ -4583,8 +4523,8 @@ static int rt6_fill_node(struct net *net,
 	if (rt->rt6i_flags & RTF_CACHE)
 		rtm->rtm_flags |= RTM_F_CLONED;
 
-	if (dst) {
-		if (nla_put_in6_addr(skb, RTA_DST, dst))
+	if (dest) {
+		if (nla_put_in6_addr(skb, RTA_DST, dest))
 			goto nla_put_failure;
 		rtm->rtm_dst_len = 128;
 	} else if (rtm->rtm_dst_len)
@@ -4612,9 +4552,9 @@ static int rt6_fill_node(struct net *net,
 #endif
 			if (nla_put_u32(skb, RTA_IIF, iif))
 				goto nla_put_failure;
-	} else if (dst) {
+	} else if (dest) {
 		struct in6_addr saddr_buf;
-		if (ip6_route_get_saddr(net, rt, dst, 0, &saddr_buf) == 0 &&
+		if (ip6_route_get_saddr(net, rt, dest, 0, &saddr_buf) == 0 &&
 		    nla_put_in6_addr(skb, RTA_PREFSRC, &saddr_buf))
 			goto nla_put_failure;
 	}
@@ -4626,10 +4566,8 @@ static int rt6_fill_node(struct net *net,
 			goto nla_put_failure;
 	}
 
-	memcpy(metrics, dst_metrics_ptr(&rt->dst), sizeof(metrics));
-	if (rt->rt6i_pmtu)
-		metrics[RTAX_MTU - 1] = rt->rt6i_pmtu;
-	if (rtnetlink_put_metrics(skb, metrics) < 0)
+	pmetrics = dst ? dst_metrics_ptr(dst) : rt->fib6_metrics->metrics;
+	if (rtnetlink_put_metrics(skb, pmetrics) < 0)
 		goto nla_put_failure;
 
 	if (nla_put_u32(skb, RTA_PRIORITY, rt->rt6i_metric))
@@ -4661,9 +4599,10 @@ static int rt6_fill_node(struct net *net,
 			goto nla_put_failure;
 	}
 
-	expires = (rt->rt6i_flags & RTF_EXPIRES) ? rt->dst.expires - jiffies : 0;
+	if (rt->rt6i_flags & RTF_EXPIRES && dst)
+		expires = dst->expires - jiffies;
 
-	if (rtnl_put_cacheinfo(skb, &rt->dst, 0, expires, rt->dst.error) < 0)
+	if (rtnl_put_cacheinfo(skb, dst, 0, expires, dst ? dst->error : 0) < 0)
 		goto nla_put_failure;
 
 	if (nla_put_u8(skb, RTA_PREF, IPV6_EXTRACT_PREF(rt->rt6i_flags)))
@@ -4697,10 +4636,9 @@ int rt6_dump_route(struct rt6_info *rt, void *p_arg)
 		}
 	}
 
-	return rt6_fill_node(net,
-		     arg->skb, rt, NULL, NULL, 0, RTM_NEWROUTE,
-		     NETLINK_CB(arg->cb->skb).portid, arg->cb->nlh->nlmsg_seq,
-		     NLM_F_MULTI);
+	return rt6_fill_node(net, arg->skb, rt, NULL, NULL, NULL, 0,
+			     RTM_NEWROUTE, NETLINK_CB(arg->cb->skb).portid,
+			     arg->cb->nlh->nlmsg_seq, NLM_F_MULTI);
 }
 
 static int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
@@ -4814,13 +4752,14 @@ static int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 
 	skb_dst_set(skb, &rt->dst);
 	if (fibmatch)
-		err = rt6_fill_node(net, skb, rt, NULL, NULL, iif,
+		err = rt6_fill_node(net, skb, rt, NULL, NULL, NULL, iif,
 				    RTM_NEWROUTE, NETLINK_CB(in_skb).portid,
 				    nlh->nlmsg_seq, 0);
 	else
-		err = rt6_fill_node(net, skb, rt, &fl6.daddr, &fl6.saddr, iif,
-				    RTM_NEWROUTE, NETLINK_CB(in_skb).portid,
-				    nlh->nlmsg_seq, 0);
+		err = rt6_fill_node(net, skb, rt, dst, &fl6.daddr, &fl6.saddr,
+				    iif, RTM_NEWROUTE,
+				    NETLINK_CB(in_skb).portid, nlh->nlmsg_seq,
+				    0);
 	if (err < 0) {
 		kfree_skb(skb);
 		goto errout;
@@ -4846,8 +4785,8 @@ void inet6_rt_notify(int event, struct rt6_info *rt, struct nl_info *info,
 	if (!skb)
 		goto errout;
 
-	err = rt6_fill_node(net, skb, rt, NULL, NULL, 0,
-				event, info->portid, seq, nlm_flags);
+	err = rt6_fill_node(net, skb, rt, NULL, NULL, NULL, 0,
+			    event, info->portid, seq, nlm_flags);
 	if (err < 0) {
 		/* -EMSGSIZE implies BUG in rt6_nlmsg_size() */
 		WARN_ON(err == -EMSGSIZE);
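
Reviewer note: the ownership scheme this patch introduces (a shared default metrics block, a private block allocated on first write, and dst clones that share the FIB entry's block by reference count) can be modelled in user space. The sketch below is only an illustration under stated assumptions, not kernel code: `struct metrics`, `struct fib_entry`, `struct dst_clone`, `fib_metric_set`, `dst_clone_attach`, `metrics_put`, `METRIC_MAX` and `METRIC_MTU` are hypothetical stand-ins for `struct dst_metrics`, `rt6_info`, the cloned dst, `fib6_metric_set`, `rt6_set_from`, the release step in `ip6_dst_destroy`, `RTAX_MAX` and `RTAX_MTU`. A plain counter replaces `refcount_t`, `calloc` replaces `kzalloc`, and the setter returns a status where the kernel helper returns void.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define METRIC_MAX 4	/* hypothetical stand-in for RTAX_MAX */
#define METRIC_MTU 2	/* hypothetical stand-in for RTAX_MTU (1-based) */

/* Stand-in for struct dst_metrics: values plus a reference count. */
struct metrics {
	uint32_t vals[METRIC_MAX];
	unsigned int refcnt;	/* plain counter; the kernel uses refcount_t */
};

/* Shared default block that is never freed, like dst_default_metrics. */
static struct metrics default_metrics = { .refcnt = 1 };

/* Stand-in for the FIB entry (rt6_info in this patch). */
struct fib_entry {
	struct metrics *m;
};

/* Stand-in for a dst cloned from a FIB entry. */
struct dst_clone {
	struct metrics *m;
};

/* Every new entry starts out pointing at the shared default block. */
static void fib_entry_init(struct fib_entry *f)
{
	f->m = &default_metrics;
}

/*
 * Modelled on fib6_metric_set: allocate a private block the first time
 * an entry that still uses the defaults is written, then store the value.
 * Returns 0 on success, -1 on a NULL entry or allocation failure.
 */
static int fib_metric_set(struct fib_entry *f, int metric, uint32_t val)
{
	if (!f)
		return -1;

	if (f->m == &default_metrics) {
		struct metrics *p = calloc(1, sizeof(*p));

		if (!p)
			return -1;
		p->refcnt = 1;
		f->m = p;
	}

	f->m->vals[metric - 1] = val;
	return 0;
}

/*
 * Modelled on rt6_set_from: the clone shares the entry's block by
 * reference and takes a count only when the block is not the default.
 */
static void dst_clone_attach(struct dst_clone *d, struct fib_entry *f)
{
	d->m = f->m;
	if (f->m != &default_metrics)
		f->m->refcnt++;
}

/* Modelled on the release in ip6_dst_destroy: free on the last put. */
static void metrics_put(struct metrics *m)
{
	if (m != &default_metrics && --m->refcnt == 0)
		free(m);
}
```

In this model a later `fib_metric_set` on the entry is visible through every clone that shares the block, which is the behaviour the commit message describes as copying metrics into the dst by reference.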