[{"id":2365110,"web_url":"http://patchwork.ozlabs.org/comment/2365110/","msgid":"<e869424c-eaf5-d8b1-dfde-86958f437538@iogearbox.net>","list_archive_url":null,"date":"2020-02-18T22:34:47","subject":"Re: [PATCH 06/18] bpf: Add bpf_ksym_tree tree","submitter":{"id":65705,"url":"http://patchwork.ozlabs.org/api/people/65705/","name":"Daniel Borkmann","email":"daniel@iogearbox.net"},"content":"On 2/16/20 8:29 PM, Jiri Olsa wrote:\n> The bpf_tree is used both for kallsyms iterations and searching\n> for exception tables of bpf programs, which is needed only for\n> bpf programs.\n> \n> Adding bpf_ksym_tree that will hold symbols for all bpf_prog\n> bpf_trampoline and bpf_dispatcher objects and keeping bpf_tree\n> only for bpf_prog objects to keep it fast.\n> \n> Signed-off-by: Jiri Olsa <jolsa@kernel.org>\n> ---\n>   include/linux/bpf.h |  1 +\n>   kernel/bpf/core.c   | 60 ++++++++++++++++++++++++++++++++++++++++-----\n>   2 files changed, 55 insertions(+), 6 deletions(-)\n> \n> diff --git a/include/linux/bpf.h b/include/linux/bpf.h\n> index f1174d24c185..5d6649cdc3df 100644\n> --- a/include/linux/bpf.h\n> +++ b/include/linux/bpf.h\n> @@ -468,6 +468,7 @@ struct bpf_ksym {\n>   \tunsigned long\t\t end;\n>   \tchar\t\t\t name[KSYM_NAME_LEN];\n>   \tstruct list_head\t lnode;\n> +\tstruct latch_tree_node\t tnode;\n>   };\n>   \n>   enum bpf_tramp_prog_type {\n> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c\n> index 604093d2153a..9fb08b4d01f7 100644\n> --- a/kernel/bpf/core.c\n> +++ b/kernel/bpf/core.c\n> @@ -606,8 +606,46 @@ static const struct latch_tree_ops bpf_tree_ops = {\n>   \t.comp\t= bpf_tree_comp,\n>   };\n>   \n> +static unsigned long\n> +bpf_get_ksym_start(struct latch_tree_node *n)\n> +{\n> +\tconst struct bpf_ksym *ksym;\n> +\n> +\tksym = container_of(n, struct bpf_ksym, tnode);\n> +\treturn ksym->start;\n\nSmall nit, can be simplified to:\n\n\treturn container_of(n, struct bpf_ksym, tnode)->start;\n\n> +}\n> +\n> +static bool\n> +bpf_ksym_tree_less(struct 
latch_tree_node *a,\n> +\t\t   struct latch_tree_node *b)\n> +{\n> +\treturn bpf_get_ksym_start(a) < bpf_get_ksym_start(b);\n> +}\n> +\n> +static int\n> +bpf_ksym_tree_comp(void *key, struct latch_tree_node *n)\n> +{\n> +\tunsigned long val = (unsigned long)key;\n> +\tconst struct bpf_ksym *ksym;\n> +\n> +\tksym = container_of(n, struct bpf_ksym, tnode);\n> +\n> +\tif (val < ksym->start)\n> +\t\treturn -1;\n> +\tif (val >= ksym->end)\n> +\t\treturn  1;\n> +\n> +\treturn 0;\n> +}\n> +\n> +static const struct latch_tree_ops bpf_ksym_tree_ops = {\n> +\t.less\t= bpf_ksym_tree_less,\n> +\t.comp\t= bpf_ksym_tree_comp,\n> +};\n> +\n>   static DEFINE_SPINLOCK(bpf_lock);\n>   static LIST_HEAD(bpf_kallsyms);\n> +static struct latch_tree_root bpf_ksym_tree __cacheline_aligned;\n>   static struct latch_tree_root bpf_tree __cacheline_aligned;\n\nYou mention in your commit description performance being the reason on why\nwe need two latch trees. Can't we maintain everything just in a single one?\n\nWhat does \"to keep it fast\" mean here in absolute numbers that would affect\noverall system performance? 
It feels a bit like premature optimization with\nthe above rationale as-is.\n\nIf it is about differentiating the different bpf_ksym symbols for some of the\nkallsym handling functions (?), can't we simply add an enum bpf_ksym_type {\nBPF_SYM_PROGRAM, BPF_SYM_TRAMPOLINE, BPF_SYM_DISPATCHER } instead, but still\nmaintain them all in a single latch tree?\n\nThanks,\nDaniel","headers":{"Return-Path":"<bpf-owner@vger.kernel.org>","X-Original-To":"incoming-bpf@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming-bpf@bilbo.ozlabs.org","Authentication-Results":["ozlabs.org; spf=none (no SPF record)\n\tsmtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;\n\thelo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dmarc=none (p=none dis=none)\n\theader.from=iogearbox.net"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 48MbJZ18WXz9sRN\n\tfor <incoming-bpf@patchwork.ozlabs.org>;\n\tWed, 19 Feb 2020 09:34:54 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1726595AbgBRWex (ORCPT\n\t<rfc822;incoming-bpf@patchwork.ozlabs.org>);\n\tTue, 18 Feb 2020 17:34:53 -0500","from www62.your-server.de ([213.133.104.62]:47746 \"EHLO\n\twww62.your-server.de\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1726415AbgBRWew (ORCPT <rfc822; bpf@vger.kernel.org>); \n\tTue, 18 Feb 2020 17:34:52 -0500","from sslproxy02.your-server.de ([78.47.166.47])\n\tby www62.your-server.de with esmtpsa\n\t(TLSv1.2:DHE-RSA-AES256-GCM-SHA384:256) (Exim 4.89_1)\n\t(envelope-from <daniel@iogearbox.net>)\n\tid 1j4BS4-0005lA-Hh; Tue, 18 Feb 2020 23:34:48 +0100","from [85.7.42.192] (helo=pc-9.home)\n\tby sslproxy02.your-server.de with esmtpsa\n\t(TLSv1.3:TLS_AES_256_GCM_SHA384:256) (Exim 4.92)\n\t(envelope-from <daniel@iogearbox.net>)\n\tid 1j4BS4-000OU6-1k; Tue, 18 Feb 2020 23:34:48 +0100"],"Subject":"Re: [PATCH 06/18] bpf: Add bpf_ksym_tree 
tree","To":"Jiri Olsa <jolsa@kernel.org>, Alexei Starovoitov <ast@kernel.org>","Cc":"netdev@vger.kernel.org, bpf@vger.kernel.org, Andrii Nakryiko\n\t<andriin@fb.com>, Yonghong Song <yhs@fb.com>, Song Liu\n\t<songliubraving@fb.com>,         Martin KaFai Lau <kafai@fb.com>,\n\tJakub Kicinski <kuba@kernel.org>, David Miller <davem@redhat.com>,\n\t=?utf-8?b?QmrDtnJuIFTDtnBlbA==?= <bjorn.topel@intel.com>,\n\tJohn Fastabend <john.fastabend@gmail.com>, Jesper Dangaard Brouer\n\t<hawk@kernel.org>,         Arnaldo Carvalho de Melo <acme@redhat.com>","References":"<20200216193005.144157-1-jolsa@kernel.org>\n\t<20200216193005.144157-7-jolsa@kernel.org>","From":"Daniel Borkmann <daniel@iogearbox.net>","Message-ID":"<e869424c-eaf5-d8b1-dfde-86958f437538@iogearbox.net>","Date":"Tue, 18 Feb 2020 23:34:47 +0100","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101\n\tThunderbird/60.7.2","MIME-Version":"1.0","In-Reply-To":"<20200216193005.144157-7-jolsa@kernel.org>","Content-Type":"text/plain; charset=windows-1252; format=flowed","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","X-Authenticated-Sender":"daniel@iogearbox.net","X-Virus-Scanned":"Clear (ClamAV 0.102.1/25727/Tue Feb 18 15:05:00 2020)","Sender":"bpf-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<bpf.vger.kernel.org>","X-Mailing-List":"bpf@vger.kernel.org"}},{"id":2365408,"web_url":"http://patchwork.ozlabs.org/comment/2365408/","msgid":"<20200219084132.GC439238@krava>","list_archive_url":null,"date":"2020-02-19T08:41:32","subject":"Re: [PATCH 06/18] bpf: Add bpf_ksym_tree tree","submitter":{"id":2492,"url":"http://patchwork.ozlabs.org/api/people/2492/","name":"Jiri Olsa","email":"jolsa@redhat.com"},"content":"On Tue, Feb 18, 2020 at 11:34:47PM +0100, Daniel Borkmann wrote:\n> On 2/16/20 8:29 PM, Jiri Olsa wrote:\n> > The bpf_tree is used both for kallsyms iterations and searching\n> > for exception tables of bpf programs, which is needed only for\n> > bpf programs.\n> > \n> > Adding 
bpf_ksym_tree that will hold symbols for all bpf_prog\n> > bpf_trampoline and bpf_dispatcher objects and keeping bpf_tree\n> > only for bpf_prog objects to keep it fast.\n> > \n> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>\n> > ---\n> >   include/linux/bpf.h |  1 +\n> >   kernel/bpf/core.c   | 60 ++++++++++++++++++++++++++++++++++++++++-----\n> >   2 files changed, 55 insertions(+), 6 deletions(-)\n> > \n> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h\n> > index f1174d24c185..5d6649cdc3df 100644\n> > --- a/include/linux/bpf.h\n> > +++ b/include/linux/bpf.h\n> > @@ -468,6 +468,7 @@ struct bpf_ksym {\n> >   \tunsigned long\t\t end;\n> >   \tchar\t\t\t name[KSYM_NAME_LEN];\n> >   \tstruct list_head\t lnode;\n> > +\tstruct latch_tree_node\t tnode;\n> >   };\n> >   enum bpf_tramp_prog_type {\n> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c\n> > index 604093d2153a..9fb08b4d01f7 100644\n> > --- a/kernel/bpf/core.c\n> > +++ b/kernel/bpf/core.c\n> > @@ -606,8 +606,46 @@ static const struct latch_tree_ops bpf_tree_ops = {\n> >   \t.comp\t= bpf_tree_comp,\n> >   };\n> > +static unsigned long\n> > +bpf_get_ksym_start(struct latch_tree_node *n)\n> > +{\n> > +\tconst struct bpf_ksym *ksym;\n> > +\n> > +\tksym = container_of(n, struct bpf_ksym, tnode);\n> > +\treturn ksym->start;\n> \n> Small nit, can be simplified to:\n> \n> \treturn container_of(n, struct bpf_ksym, tnode)->start;\n\nok\n\n> \n> > +}\n> > +\n> > +static bool\n> > +bpf_ksym_tree_less(struct latch_tree_node *a,\n> > +\t\t   struct latch_tree_node *b)\n> > +{\n> > +\treturn bpf_get_ksym_start(a) < bpf_get_ksym_start(b);\n> > +}\n> > +\n> > +static int\n> > +bpf_ksym_tree_comp(void *key, struct latch_tree_node *n)\n> > +{\n> > +\tunsigned long val = (unsigned long)key;\n> > +\tconst struct bpf_ksym *ksym;\n> > +\n> > +\tksym = container_of(n, struct bpf_ksym, tnode);\n> > +\n> > +\tif (val < ksym->start)\n> > +\t\treturn -1;\n> > +\tif (val >= ksym->end)\n> > +\t\treturn  1;\n> > +\n> > +\treturn 
0;\n> > +}\n> > +\n> > +static const struct latch_tree_ops bpf_ksym_tree_ops = {\n> > +\t.less\t= bpf_ksym_tree_less,\n> > +\t.comp\t= bpf_ksym_tree_comp,\n> > +};\n> > +\n> >   static DEFINE_SPINLOCK(bpf_lock);\n> >   static LIST_HEAD(bpf_kallsyms);\n> > +static struct latch_tree_root bpf_ksym_tree __cacheline_aligned;\n> >   static struct latch_tree_root bpf_tree __cacheline_aligned;\n> \n> You mention in your commit description performance being the reason on why\n> we need two latch trees. Can't we maintain everything just in a single one?\n> \n> What does \"to keep it fast\" mean here in absolute numbers that would affect\n> overall system performance? It feels a bit like premature optimization with\n> the above rationale as-is.\n> \n> If it is about differentiating the different bpf_ksym symbols for some of the\n> kallsym handling functions (?), can't we simply add an enum bpf_ksym_type {\n> BPF_SYM_PROGRAM, BPF_SYM_TRAMPOLINE, BPF_SYM_DISPATCHER } instead, but still\n> maintain them all in a single latch tree?\n\nthe motivation is that up to now the tree stored only bpf_prog objects,\nand the tree was used both for kallsym and exception table lookups\n(in search_bpf_extables function)\n\nbut if we'd add trampoline and dispatcher objects to the same tree, then\nthe exception table lookups would suffer from having to traverse more\nobjects within the search, hence the separation of the trees\n\nI don't have any performance numbers supporting this, just the rationale\nabove\n\njirka","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org; spf=none (no SPF record)\n\tsmtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;\n\thelo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org;\n\tdmarc=pass (p=none dis=none) header.from=redhat.com","ozlabs.org; dkim=pass 
(1024-bit key;\n\tunprotected) header.d=redhat.com header.i=@redhat.com\n\theader.a=rsa-sha256 header.s=mimecast20190719\n\theader.b=g6P1ThNT; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 48Mrms3m5Zz9sRk\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n\tWed, 19 Feb 2020 19:41:49 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1726195AbgBSIlr (ORCPT\n\t<rfc822;patchwork-incoming-netdev@ozlabs.org>);\n\tWed, 19 Feb 2020 03:41:47 -0500","from us-smtp-2.mimecast.com ([207.211.31.81]:34461 \"EHLO\n\tus-smtp-delivery-1.mimecast.com\" rhost-flags-OK-OK-OK-FAIL)\n\tby vger.kernel.org with ESMTP id S1727082AbgBSIlo (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Wed, 19 Feb 2020 03:41:44 -0500","from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com\n\t[209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id\n\tus-mta-413-jyzXFO6gMM2J_yhswVGXNQ-1; Wed, 19 Feb 2020 03:41:39 -0500","from smtp.corp.redhat.com\n\t(int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mimecast-mx01.redhat.com (Postfix) with ESMTPS id ADF9710CE78E;\n\tWed, 19 Feb 2020 08:41:37 +0000 (UTC)","from krava (unknown [10.43.17.9])\n\tby smtp.corp.redhat.com (Postfix) with ESMTPS id CB6885D9E2;\n\tWed, 19 Feb 2020 08:41:34 +0000 (UTC)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;\n\ts=mimecast20190719; 
t=1582101703;\n\th=from:from:reply-to:subject:subject:date:date:message-id:message-id:\n\tto:to:cc:cc:mime-version:mime-version:content-type:content-type:\n\tin-reply-to:in-reply-to:references:references;\n\tbh=8elBKO8bz2MbquCOlkktOPvafk4foe0zxZbMQ9OMdfM=;\n\tb=g6P1ThNTdPGVi9lHGL+9mYGZ8kHDNFg8OgRVw7N+pubE+4WUcyugbxwI62MgKdPPMa8bJb\n\tmFFaCaC75RIW9xZ8eQPjdGmTFcJvlsDQxR06P95TSzUOe1xkmpe2sZ49VWVpidH14bZRCm\n\tZbksTayWi8S1Kb7Bn+xXpnzCQRwu8Yk=","X-MC-Unique":"jyzXFO6gMM2J_yhswVGXNQ-1","Date":"Wed, 19 Feb 2020 09:41:32 +0100","From":"Jiri Olsa <jolsa@redhat.com>","To":"Daniel Borkmann <daniel@iogearbox.net>","Cc":"Jiri Olsa <jolsa@kernel.org>, Alexei Starovoitov <ast@kernel.org>,\n\tnetdev@vger.kernel.org, bpf@vger.kernel.org, Andrii Nakryiko\n\t<andriin@fb.com>, Yonghong Song <yhs@fb.com>, Song Liu\n\t<songliubraving@fb.com>,         Martin KaFai Lau <kafai@fb.com>,\n\tJakub Kicinski <kuba@kernel.org>, David Miller <davem@redhat.com>,\n\t=?iso-8859-1?q?Bj=F6rn_T=F6pel?= <bjorn.topel@intel.com>,\n\tJohn Fastabend <john.fastabend@gmail.com>, Jesper Dangaard Brouer\n\t<hawk@kernel.org>,         Arnaldo Carvalho de Melo <acme@redhat.com>","Subject":"Re: [PATCH 06/18] bpf: Add bpf_ksym_tree tree","Message-ID":"<20200219084132.GC439238@krava>","References":"<20200216193005.144157-1-jolsa@kernel.org>\n\t<20200216193005.144157-7-jolsa@kernel.org>\n\t<e869424c-eaf5-d8b1-dfde-86958f437538@iogearbox.net>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<e869424c-eaf5-d8b1-dfde-86958f437538@iogearbox.net>","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.14","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}}]