From patchwork Mon Jul 24 01:35:45 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 792613
X-Patchwork-Delegate: davem@davemloft.net
From: Jamal Hadi Salim
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, jiri@resnulli.us, xiyou.wangcong@gmail.com,
 dsahern@gmail.com, eric.dumazet@gmail.com, mrv@mojatatu.com,
 simon.horman@netronome.com, alex.aring@gmail.com, Jamal Hadi Salim
Subject: [PATCH net-next v11 3/4] net sched actions: dump more than
 TCA_ACT_MAX_PRIO actions per batch
Date: Sun, 23 Jul 2017 21:35:45 -0400
Message-Id: <1500860146-26970-4-git-send-email-jhs@emojatatu.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1500860146-26970-1-git-send-email-jhs@emojatatu.com>
References: <1500860146-26970-1-git-send-email-jhs@emojatatu.com>
X-Mailing-List: netdev@vger.kernel.org

From: Jamal Hadi Salim

When you dump hundreds of thousands of actions, getting only 32 per
dump batch is inefficient when the socket buffer and memory allocations
would allow more.

With this change, the user gets as many actions per batch as fit
within the constraints available to the kernel.

The top level action TLV space is extended. An attribute
TCA_ROOT_FLAGS is used to carry flags; the user sets flag
TCA_FLAG_LARGE_DUMP_ON to indicate that it is capable of processing
these large dumps. Older user space which doesn't set this flag
doesn't get the larger (than 32) batches.
The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
actions are put in a single batch. As such, a user space app knows how
long to iterate (independent of the type of action being dumped)
instead of relying on a hardcoded maximum of 32, thus maintaining
backward compat.

Some results dumping 1.5M actions below:
first an unpatched tc which doesn't understand these features...

prompt$ time -p tc actions ls action gact | grep index | wc -l
1500000
real 1388.43
user 2.07
sys 1386.79

Now let's see a patched tc which sets the correct flags when requesting
a dump:

prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
1500000
real 178.13
user 2.02
sys 176.96

That is about an 8x performance improvement (1388.43s vs. 178.13s real
time) for a tc app which sets its receive buffer to about 32K.

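For illustration only, the request/response exchange described above
can be sketched in plain user-space C. This is a minimal sketch, not
the tc implementation: the demo_* names, the attribute helpers and the
value/selector payload layout are assumptions made for this example,
and the attribute numbers simply follow the enum order in this patch
(TCA_ROOT_TAB = 1, TCA_ROOT_FLAGS = 2, TCA_ROOT_COUNT = 3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for names in this patch; not taken from kernel headers. */
enum { DEMO_ROOT_TAB = 1, DEMO_ROOT_FLAGS = 2, DEMO_ROOT_COUNT = 3 };
#define DEMO_FLAG_LARGE_DUMP_ON (1u << 0)
#define DEMO_ALIGN(n) (((n) + 3u) & ~3u)   /* netlink pads attributes to 4 bytes */

/* Assumed bitfield payload: flag values plus a mask of which bits are meant. */
struct demo_bitfield32 { uint32_t value; uint32_t selector; };

/* Attribute header: len covers header plus payload, excluding padding. */
struct demo_attr { uint16_t len; uint16_t type; };

/* Append one attribute at offset off; returns new offset, or 0 if it does not fit. */
static size_t demo_put_attr(unsigned char *buf, size_t cap, size_t off,
                            uint16_t type, const void *data, uint16_t len)
{
    struct demo_attr hdr = { (uint16_t)(sizeof(hdr) + len), type };
    size_t need = DEMO_ALIGN(sizeof(hdr) + len);

    if (off > cap || need > cap - off)
        return 0;
    memset(buf + off, 0, need);
    memcpy(buf + off, &hdr, sizeof(hdr));
    memcpy(buf + off + sizeof(hdr), data, len);
    return off + need;
}

/* Find the first attribute of a given type; returns its payload or NULL. */
static const unsigned char *demo_find_attr(const unsigned char *buf, size_t total,
                                           uint16_t type, uint16_t *payload_len)
{
    size_t off = 0;

    while (off <= total && total - off >= sizeof(struct demo_attr)) {
        struct demo_attr hdr;

        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.len < sizeof(hdr) || hdr.len > total - off)
            return NULL;                    /* malformed attribute */
        if (hdr.type == type) {
            *payload_len = (uint16_t)(hdr.len - sizeof(hdr));
            return buf + off + sizeof(hdr);
        }
        off += DEMO_ALIGN(hdr.len);
    }
    return NULL;
}

/* Request side: announce that large dumps can be handled. */
static size_t demo_build_request(unsigned char *buf, size_t cap)
{
    struct demo_bitfield32 bf = { DEMO_FLAG_LARGE_DUMP_ON, DEMO_FLAG_LARGE_DUMP_ON };

    return demo_put_attr(buf, cap, 0, DEMO_ROOT_FLAGS, &bf, sizeof(bf));
}

/* Response side: actions in this batch, or 0 if no count attribute is present. */
static uint32_t demo_batch_count(const unsigned char *buf, size_t total)
{
    uint16_t len = 0;
    uint32_t count = 0;
    const unsigned char *p = demo_find_attr(buf, total, DEMO_ROOT_COUNT, &len);

    if (p && len == sizeof(count))
        memcpy(&count, p, sizeof(count));
    return count;
}
```

A caller would pass the buffer from demo_build_request() as the
attribute area of the dump request, then call demo_batch_count() on the
attribute area of each reply to learn how many actions to iterate over.
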
Signed-off-by: Jamal Hadi Salim
---
 include/net/netlink.h          | 12 +++++++++++
 include/uapi/linux/rtnetlink.h | 22 +++++++++++++++++--
 net/sched/act_api.c            | 48 +++++++++++++++++++++++++++++++++---------
 3 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/include/net/netlink.h b/include/net/netlink.h
index e33d1fb..87c0b15 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -1207,6 +1207,18 @@ static inline struct in6_addr nla_get_in6_addr(const struct nlattr *nla)
 }
 
 /**
+ * nla_get_bitfield_32 - return payload of 32 bitfield attribute
+ * @nla: nla_bitfield_32 attribute
+ */
+static inline struct nla_bitfield_32 nla_get_bitfield_32(const struct nlattr *nla)
+{
+	struct nla_bitfield_32 tmp;
+
+	nla_memcpy(&tmp, nla, sizeof(tmp));
+	return tmp;
+}
+
+/**
  * nla_memdup - duplicate attribute memory (kmemdup)
  * @src: netlink attribute to duplicate from
  * @gfp: GFP mask
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index d148505..bfa80a6 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -683,10 +683,28 @@ struct tcamsg {
 	unsigned char tca__pad1;
 	unsigned short tca__pad2;
 };
+
+enum {
+	TCA_ROOT_UNSPEC,
+	TCA_ROOT_TAB,
+#define TCA_ACT_TAB TCA_ROOT_TAB
+#define TCAA_MAX TCA_ROOT_TAB
+	TCA_ROOT_FLAGS,
+	TCA_ROOT_COUNT,
+	__TCA_ROOT_MAX,
+#define TCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
+};
+
 #define TA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct tcamsg))))
 #define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
-#define TCA_ACT_TAB 1 /* attr type must be >=1 */
-#define TCAA_MAX 1
+/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
+ *
+ * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
+ * actions in a dump. All dump responses will contain the number of actions
+ * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
+ *
+ */
+#define TCA_FLAG_LARGE_DUMP_ON (1 << 0)
 
 /* New extended info filters for IFLA_EXT_MASK */
 #define RTEXT_FILTER_VF (1 << 0)
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 848370e..15d6c46 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -110,6 +110,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 			   struct netlink_callback *cb)
 {
 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
+	u32 act_flags = cb->args[2];
 	struct nlattr *nest;
 
 	spin_lock_bh(&hinfo->lock);
@@ -138,14 +139,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 			}
 			nla_nest_end(skb, nest);
 			n_i++;
-			if (n_i >= TCA_ACT_MAX_PRIO)
+			if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
+			    n_i >= TCA_ACT_MAX_PRIO)
 				goto done;
 		}
 	}
 done:
 	spin_unlock_bh(&hinfo->lock);
-	if (n_i)
+	if (n_i) {
 		cb->args[0] += n_i;
+		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
+			cb->args[1] = n_i;
+	}
 	return n_i;
 
 nla_put_failure:
@@ -1068,11 +1073,17 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
 	return tcf_add_notify(net, n, &actions, portid);
 }
 
+static u32 tcaa_root_flags_allowed = TCA_FLAG_LARGE_DUMP_ON;
+static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
+	[TCA_ROOT_FLAGS] = { .type = NLA_BITFIELD_32,
+			     .validation_data = &tcaa_root_flags_allowed },
+};
+
 static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 			 struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
-	struct nlattr *tca[TCAA_MAX + 1];
+	struct nlattr *tca[TCA_ROOT_MAX + 1];
 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
 	int ret = 0, ovr = 0;
 
@@ -1080,7 +1091,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	    !netlink_capable(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
+	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ROOT_MAX, NULL,
 			  extack);
 	if (ret < 0)
 		return ret;
@@ -1121,16 +1132,12 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	return ret;
 }
 
-static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
+static struct nlattr *find_dump_kind(struct nlattr **nla)
 {
 	struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
-	struct nlattr *nla[TCAA_MAX + 1];
 	struct nlattr *kind;
 
-	if (nlmsg_parse(n, sizeof(struct tcamsg), nla, TCAA_MAX,
-			NULL, NULL) < 0)
-		return NULL;
 	tb1 = nla[TCA_ACT_TAB];
 	if (tb1 == NULL)
 		return NULL;
@@ -1157,8 +1164,18 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	struct tc_action_ops *a_o;
 	int ret = 0;
 	struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
-	struct nlattr *kind = find_dump_kind(cb->nlh);
+	struct nla_bitfield_32 fb;
+	struct nlattr *count_attr = NULL;
+	struct nlattr *tb[TCA_ROOT_MAX + 1];
+	struct nlattr *kind = NULL;
+	u32 act_count = 0;
+
+	ret = nlmsg_parse(cb->nlh, sizeof(struct tcamsg), tb, TCA_ROOT_MAX,
+			  tcaa_policy, NULL);
+	if (ret < 0)
+		return ret;
 
+	kind = find_dump_kind(tb);
 	if (kind == NULL) {
 		pr_info("tc_dump_action: action bad kind\n");
 		return 0;
@@ -1168,14 +1185,22 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	if (a_o == NULL)
 		return 0;
 
+	if (tb[TCA_ROOT_FLAGS])
+		fb = nla_get_bitfield_32(tb[TCA_ROOT_FLAGS]);
+
 	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
 			cb->nlh->nlmsg_type, sizeof(*t), 0);
 	if (!nlh)
 		goto out_module_put;
+
+	cb->args[2] = fb.nla_value;
 	t = nlmsg_data(nlh);
 	t->tca_family = AF_UNSPEC;
 	t->tca__pad1 = 0;
 	t->tca__pad2 = 0;
+	count_attr = nla_reserve(skb, TCA_ROOT_COUNT, sizeof(u32));
+	if (!count_attr)
+		goto out_module_put;
 
 	nest = nla_nest_start(skb, TCA_ACT_TAB);
 	if (nest == NULL)
@@ -1188,6 +1213,9 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
 	if (ret > 0) {
 		nla_nest_end(skb, nest);
 		ret = skb->len;
+		act_count = cb->args[1];
+		memcpy(nla_data(count_attr), &act_count, sizeof(u32));
+		cb->args[1] = 0;
 	} else
 		nlmsg_trim(skb, b);
 
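
The batching pattern in the act_api.c hunks (reserve a count slot,
fill actions until either the buffer or the legacy cap stops the walk,
then write the count into the reserved slot) can be modelled in user
space roughly as follows. This is a sketch under stated assumptions,
not kernel code: the demo_* names and the fixed 16-byte entry size are
hypothetical, and the 32K buffer only mirrors the receive buffer size
mentioned in the changelog.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_PRIO 32u               /* legacy per-batch cap, as TCA_ACT_MAX_PRIO */
#define DEMO_FLAG_LARGE_DUMP_ON (1u << 0)
#define DEMO_ENTRY_SIZE 16u             /* hypothetical size of one dumped action */

/* Fill buf with up to `remaining` actions; returns how many were written. */
static uint32_t demo_dump_batch(unsigned char *buf, size_t cap,
                                uint32_t remaining, uint32_t flags)
{
    size_t off = sizeof(uint32_t);      /* reserve the count slot up front */
    uint32_t n = 0, reported;

    if (cap < off)
        return 0;
    while (n < remaining && cap - off >= DEMO_ENTRY_SIZE) {
        memset(buf + off, 0xAB, DEMO_ENTRY_SIZE);   /* stand-in for one action */
        off += DEMO_ENTRY_SIZE;
        n++;
        /* Callers without the flag stop at the fixed cap; others keep going. */
        if (!(flags & DEMO_FLAG_LARGE_DUMP_ON) && n >= DEMO_MAX_PRIO)
            break;
    }
    /* As in the diff, the count is only recorded when the flag is set. */
    reported = (flags & DEMO_FLAG_LARGE_DUMP_ON) ? n : 0;
    memcpy(buf, &reported, sizeof(reported));       /* backfill the reserved slot */
    return n;
}

/* Number of batches needed to dump `total` actions with a buffer of `cap` bytes. */
static uint32_t demo_batches_needed(size_t cap, uint32_t total, uint32_t flags)
{
    unsigned char buf[32768];
    uint32_t batches = 0;

    if (cap > sizeof(buf))
        cap = sizeof(buf);
    while (total > 0) {
        uint32_t n = demo_dump_batch(buf, cap, total, flags);

        if (n == 0)
            break;                      /* buffer too small for even one entry */
        total -= n;
        batches++;
    }
    return batches;
}
```

With these assumed sizes, demo_batches_needed(32768, 1500000, 0) needs
46875 batches while the same call with DEMO_FLAG_LARGE_DUMP_ON needs
733. The figures only illustrate the model; real netlink messages and
action sizes differ.
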