[v2,bpf-next,00/11] libbpf: split BTF support

Message ID 20201105043402.2530976-1-andrii@kernel.org

Message

Andrii Nakryiko Nov. 5, 2020, 4:33 a.m. UTC
This patch set adds support for generating and deduplicating split BTF. This
is an enhancement to BTF that allows designating one BTF as the "base
BTF" (e.g., vmlinux BTF), and one or more other BTFs as "split BTF" (e.g.,
kernel module BTF), which build upon and extend the base BTF with extra
types and strings.

Once loaded, split BTF appears as a single unified BTF superset of base BTF,
with a continuous and transparent numbering scheme. This allows all the existing
users of BTF to work correctly and stay agnostic to the base/split BTF
composition. The only difference is in how to instantiate split BTF: it
requires base BTF to be already instantiated and passed to btf__new_xxx_split()
or btf__parse_xxx_split() "constructors" explicitly.
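
To make the "continuous and transparent numbering" idea concrete, here is a
minimal toy model in plain C. It is not libbpf code: struct toy_btf, its
fields, and the toy_* helpers and example tables are all hypothetical names
made up for illustration. The sketch assumes only what is described above: a
split object owns the type IDs that follow the base's IDs, and lookups of
lower IDs are forwarded to the base.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a BTF object (hypothetical; NOT libbpf's struct btf). */
struct toy_btf {
    const struct toy_btf *base;  /* NULL for a base BTF */
    unsigned int start_id;       /* first type ID owned by this object */
    unsigned int nr_types;       /* number of types owned by this object */
    const char *const *names;    /* names[i] belongs to ID start_id + i */
};

/* Number of type IDs visible through this object (base + own types). */
static unsigned int toy_btf_type_cnt(const struct toy_btf *btf)
{
    assert(btf != NULL);
    return btf->start_id + btf->nr_types;
}

/* Resolve a type ID to a name; IDs below start_id are forwarded to the base. */
static const char *toy_btf_name_by_id(const struct toy_btf *btf, unsigned int id)
{
    assert(btf != NULL);
    if (id >= toy_btf_type_cnt(btf))
        return NULL;
    if (id < btf->start_id)
        return btf->base ? toy_btf_name_by_id(btf->base, id) : NULL;
    return btf->names[id - btf->start_id];
}

/* Example data: a "vmlinux" base with IDs 0..2 and a "module" split at ID 3. */
static const char *const toy_vmlinux_names[] = { "void", "int", "struct sk_buff" };
static const char *const toy_module_names[] = { "struct mod_priv" };
static const struct toy_btf toy_vmlinux = { NULL, 0, 3, toy_vmlinux_names };
static const struct toy_btf toy_module = { &toy_vmlinux, 3, 1, toy_module_names };
```

A caller holding only &toy_module can resolve both base IDs (0..2) and the
module's own ID (3) through the same function, which is the sense in which
existing users can stay agnostic to the base/split composition.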

This split approach is necessary if we are to have reasonably sized kernel
module BTFs. By deduping each kernel module's BTF individually, resulting
module BTFs contain copies of a lot of kernel types that are already present
in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
kernel configuration with 700 modules built, the non-split BTF approach results
in 115MB of BTF across all modules. With the split BTF deduplication approach,
the total size is down to 5.2MB, which is on par with vmlinux BTF (at
around 4MB). This seems reasonable and practical. As to why we'd need kernel
module BTFs, that should be pretty obvious to anyone using BPF at this point,
as it allows all the BTF-powered features to be used with kernel modules:
tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.
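
The size argument above can be illustrated with a deliberately simplified
sketch of deduplicating a module's types against a base. This is an
assumption-laden toy, not the algorithm from this series: types are identified
by name only (the real dedup compares type structure), duplicates within the
module itself are not handled, and toy_dedup_against_base() and the example
tables are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy dedup: map each module type either to the ID of an identically named
 * base type, or to a fresh ID placed right after the base's IDs.
 * Returns the number of types the module still has to carry itself.
 */
static size_t toy_dedup_against_base(const char *const *base, size_t nr_base,
                                     const char *const *mod, size_t nr_mod,
                                     unsigned int *id_map)
{
    size_t nr_kept = 0;

    assert(base != NULL && mod != NULL && id_map != NULL);
    for (size_t i = 0; i < nr_mod; i++) {
        size_t j;

        for (j = 0; j < nr_base; j++) {
            if (strcmp(mod[i], base[j]) == 0)
                break;
        }
        if (j < nr_base)
            id_map[i] = (unsigned int)j;                      /* reuse base ID */
        else
            id_map[i] = (unsigned int)(nr_base + nr_kept++);  /* new split ID */
    }
    return nr_kept;
}

/* Example data: two of the three module types already exist in the base. */
static const char *const toy_base_types[] = { "void", "int", "struct sk_buff" };
static const char *const toy_mod_types[] = { "int", "struct mod_priv", "struct sk_buff" };
```

With this example data only "struct mod_priv" stays in the module; the other
two entries collapse onto base IDs, which is the effect that keeps per-module
BTF small.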

This patch set is a prerequisite to adding split BTF support to pahole, which
is a prerequisite to integrating split BTF into the Linux kernel build setup
to generate BTF for kernel modules. The latter will come as a follow-up patch
series once this series makes it into libbpf and pahole makes use of it.

Patch #4 introduces the necessary basic support for split BTF into libbpf APIs.
Patch #8 implements minimal changes to the BTF dedup algorithm to allow
deduplicating split BTFs. Patch #11 adds an extra -B flag to bpftool for
specifying the path to base BTF for cases when one wants to dump or inspect split
BTF. All the rest are refactorings, clean ups, bug fixes and selftests.

v1->v2:
  - addressed Song's feedback.


Andrii Nakryiko (11):
  libbpf: factor out common operations in BTF writing APIs
  selftest/bpf: relax btf_dedup test checks
  libbpf: unify and speed up BTF string deduplication
  libbpf: implement basic split BTF support
  selftests/bpf: add split BTF basic test
  selftests/bpf: add checking of raw type dump in BTF writer APIs
    selftests
  libbpf: fix BTF data layout checks and allow empty BTF
  libbpf: support BTF dedup of split BTFs
  libbpf: accomodate DWARF/compiler bug with duplicated identical arrays
  selftests/bpf: add split BTF dedup selftests
  tools/bpftool: add bpftool support for split BTF

 tools/bpf/bpftool/btf.c                       |   9 +-
 tools/bpf/bpftool/main.c                      |  15 +-
 tools/bpf/bpftool/main.h                      |   1 +
 tools/lib/bpf/btf.c                           | 807 ++++++++++--------
 tools/lib/bpf/btf.h                           |   8 +
 tools/lib/bpf/libbpf.map                      |   9 +
 tools/testing/selftests/bpf/Makefile          |   2 +-
 tools/testing/selftests/bpf/btf_helpers.c     | 259 ++++++
 tools/testing/selftests/bpf/btf_helpers.h     |  19 +
 tools/testing/selftests/bpf/prog_tests/btf.c  |  40 +-
 .../bpf/prog_tests/btf_dedup_split.c          | 325 +++++++
 .../selftests/bpf/prog_tests/btf_split.c      |  99 +++
 .../selftests/bpf/prog_tests/btf_write.c      |  43 +
 tools/testing/selftests/bpf/test_progs.h      |  11 +
 14 files changed, 1292 insertions(+), 355 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/btf_helpers.c
 create mode 100644 tools/testing/selftests/bpf/btf_helpers.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_split.c

Comments

Jesper Dangaard Brouer Nov. 5, 2020, 9:52 a.m. UTC | #1
On Wed, 4 Nov 2020 20:33:50 -0800
Andrii Nakryiko <andrii@kernel.org> wrote:

> This patch set adds support for generating and deduplicating split BTF. This
> is an enhancement to the BTF, which allows to designate one BTF as the "base
> BTF" (e.g., vmlinux BTF), and one or more other BTFs as "split BTF" (e.g.,
> kernel module BTF), which are building upon and extending base BTF with extra
> types and strings.
> 
> Once loaded, split BTF appears as a single unified BTF superset of base BTF,
> with continuous and transparent numbering scheme. This allows all the existing
> users of BTF to work correctly and stay agnostic to the base/split BTFs
> composition.  The only difference is in how to instantiate split BTF: it
> requires base BTF to be already instantiated and passed to btf__new_xxx_split()
> or btf__parse_xxx_split() "constructors" explicitly.
> 
> This split approach is necessary if we are to have a reasonably-sized kernel
> module BTFs. By deduping each kernel module's BTF individually, resulting
> module BTFs contain copies of a lot of kernel types that are already present
> in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
> kernel configuration with 700 modules built, non-split BTF approach results in
> 115MBs of BTFs across all modules. With split BTF deduplication approach,
> total size is down to 5.2MBs total, which is on par with vmlinux BTF (at
> around 4MBs). This seems reasonable and practical. As to why we'd need kernel
> module BTFs, that should be pretty obvious to anyone using BPF at this point,
> as it allows all the BTF-powered features to be used with kernel modules:
> tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.

I love to see this work going forward.

My/Our (+Saeed +Ahern) use-case is for NIC-driver kernel modules.  I
want drivers to define a BTF struct that describe a meta-data area that
can be consumed/used by XDP, also available during xdp_frame to SKB
transition, which happens in net-core. So, I hope BTF-IDs are also
"available" from core kernel code?

 
> This patch set is a pre-requisite to adding split BTF support to pahole, which
> is a prerequisite to integrating split BTF into the Linux kernel build setup
> to generate BTF for kernel modules. The latter will come as a follow-up patch
> series once this series makes it to the libbpf and pahole makes use of it.
> 
> Patch #4 introduces necessary basic support for split BTF into libbpf APIs.
> Patch #8 implements minimal changes to BTF dedup algorithm to allow
> deduplicating split BTFs. Patch #11 adds extra -B flag to bpftool to allow to
> specify the path to base BTF for cases when one wants to dump or inspect split
> BTF. All the rest are refactorings, clean ups, bug fixes and selftests.
> 
> v1->v2:
>   - addressed Song's feedback.

Andrii Nakryiko Nov. 5, 2020, 7:16 p.m. UTC | #2
On Thu, Nov 5, 2020 at 1:53 AM Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> On Wed, 4 Nov 2020 20:33:50 -0800
> Andrii Nakryiko <andrii@kernel.org> wrote:
>
> > This patch set adds support for generating and deduplicating split BTF. This
> > is an enhancement to the BTF, which allows to designate one BTF as the "base
> > BTF" (e.g., vmlinux BTF), and one or more other BTFs as "split BTF" (e.g.,
> > kernel module BTF), which are building upon and extending base BTF with extra
> > types and strings.
> >
> > Once loaded, split BTF appears as a single unified BTF superset of base BTF,
> > with continuous and transparent numbering scheme. This allows all the existing
> > users of BTF to work correctly and stay agnostic to the base/split BTFs
> > composition.  The only difference is in how to instantiate split BTF: it
> > requires base BTF to be already instantiated and passed to btf__new_xxx_split()
> > or btf__parse_xxx_split() "constructors" explicitly.
> >
> > This split approach is necessary if we are to have a reasonably-sized kernel
> > module BTFs. By deduping each kernel module's BTF individually, resulting
> > module BTFs contain copies of a lot of kernel types that are already present
> > in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
> > kernel configuration with 700 modules built, non-split BTF approach results in
> > 115MBs of BTFs across all modules. With split BTF deduplication approach,
> > total size is down to 5.2MBs total, which is on par with vmlinux BTF (at
> > around 4MBs). This seems reasonable and practical. As to why we'd need kernel
> > module BTFs, that should be pretty obvious to anyone using BPF at this point,
> > as it allows all the BTF-powered features to be used with kernel modules:
> > tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.
>
> I love to see this work going forward.
>

Thanks.

> My/Our (+Saeed +Ahern) use-case is for NIC-driver kernel modules.  I
> want drivers to define a BTF struct that describe a meta-data area that
> can be consumed/used by XDP, also available during xdp_frame to SKB
> transition, which happens in net-core. So, I hope BTF-IDs are also
> "available" from core kernel code?

I'll probably need a more specific example to understand what exactly
you are asking and how you see everything working together, sorry.

If you are asking about support for using BTF_ID_LIST() macro in a
kernel module, then right now we don't call resolve_btfids on modules,
so it's not supported there yet. It's trivial to add, but we'll
probably need to teach resolve_btfids to understand split BTF. We can
do that separately after the basic "infra" lands, though.

>
>
> > This patch set is a pre-requisite to adding split BTF support to pahole, which
> > is a prerequisite to integrating split BTF into the Linux kernel build setup
> > to generate BTF for kernel modules. The latter will come as a follow-up patch
> > series once this series makes it to the libbpf and pahole makes use of it.
> >
> > Patch #4 introduces necessary basic support for split BTF into libbpf APIs.
> > Patch #8 implements minimal changes to BTF dedup algorithm to allow
> > deduplicating split BTFs. Patch #11 adds extra -B flag to bpftool to allow to
> > specify the path to base BTF for cases when one wants to dump or inspect split
> > BTF. All the rest are refactorings, clean ups, bug fixes and selftests.
> >
> > v1->v2:
> >   - addressed Song's feedback.
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
>

Saeed Mahameed Nov. 5, 2020, 7:38 p.m. UTC | #3
On Thu, 2020-11-05 at 11:16 -0800, Andrii Nakryiko wrote:
> > > This split approach is necessary if we are to have a reasonably-sized kernel
> > > module BTFs. By deduping each kernel module's BTF individually, resulting
> > > module BTFs contain copies of a lot of kernel types that are already present
> > > in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
> > > kernel configuration with 700 modules built, non-split BTF approach results in
> > > 115MBs of BTFs across all modules. With split BTF deduplication approach,
> > > total size is down to 5.2MBs total, which is on par with vmlinux BTF (at
> > > around 4MBs). This seems reasonable and practical. As to why we'd need kernel
> > > module BTFs, that should be pretty obvious to anyone using BPF at this point,
> > > as it allows all the BTF-powered features to be used with kernel modules:
> > > tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.
> >
> > I love to see this work going forward.
>
> Thanks.
>
> > My/Our (+Saeed +Ahern) use-case is for NIC-driver kernel modules.  I
> > want drivers to define a BTF struct that describe a meta-data area that
> > can be consumed/used by XDP, also available during xdp_frame to SKB
> > transition, which happens in net-core. So, I hope BTF-IDs are also
> > "available" from core kernel code?
>
> I'll probably need a more specific example to understand what exactly
> you are asking and how you see everything working together, sorry.
>

BTF-IDs can be made available for kernel/drivers, I wrote a small
patch for this a while ago.

https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp_metadata3&id=6c1cb83629226889d6fadd3ba694e827fca3e247

So the basic use case is:
1- The driver/kernel registers a BTF format (one or more).
2- Userland queries the driver's registered BTF to be able to understand the
kernel/driver buffer format.

driver example of using this infrastructure:
https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp_metadata3&id=9c24657d6cb3a7852c2e948dc9782f3f39b60104

User Queries driver's XDP metadata BTF format:
https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp_metadata3&id=6a117e2d9196f58de7cf067741e84ec242af27f6

Dump it as C header style:
https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp_metadata3&id=8bd99626879bff28379707ac3a2c3bb94fd5b410

And then use it in your XDP program to parse the packet metadata passed
from this specific driver. (I mean no real parsing is required; you
just point to the metadata buffer with the metadata BTF-formatted C
structure.)


>
> If you are asking about support for using BTF_ID_LIST() macro in a
> kernel module, then right now we don't call resolve_btfids on modules,
> so it's not supported there yet. It's trivial to add, but we'll
> probably need to teach resolve_btfids to understand split BTF. We can
> do that separately after the basic "infra" lands, though.

Andrii Nakryiko Nov. 5, 2020, 8:02 p.m. UTC | #4
On Thu, Nov 5, 2020 at 11:38 AM Saeed Mahameed <saeedm@nvidia.com> wrote:
>
> On Thu, 2020-11-05 at 11:16 -0800, Andrii Nakryiko wrote:
> > > > This split approach is necessary if we are to have a reasonably-sized kernel
> > > > module BTFs. By deduping each kernel module's BTF individually, resulting
> > > > module BTFs contain copies of a lot of kernel types that are already present
> > > > in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
> > > > kernel configuration with 700 modules built, non-split BTF approach results in
> > > > 115MBs of BTFs across all modules. With split BTF deduplication approach,
> > > > total size is down to 5.2MBs total, which is on par with vmlinux BTF (at
> > > > around 4MBs). This seems reasonable and practical. As to why we'd need kernel
> > > > module BTFs, that should be pretty obvious to anyone using BPF at this point,
> > > > as it allows all the BTF-powered features to be used with kernel modules:
> > > > tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.
> > >
> > > I love to see this work going forward.
> >
> > Thanks.
> >
> > > My/Our (+Saeed +Ahern) use-case is for NIC-driver kernel modules.  I
> > > want drivers to define a BTF struct that describe a meta-data area that
> > > can be consumed/used by XDP, also available during xdp_frame to SKB
> > > transition, which happens in net-core. So, I hope BTF-IDs are also
> > > "available" from core kernel code?
> >
> > I'll probably need a more specific example to understand what exactly
> > you are asking and how you see everything working together, sorry.
> >
>
> BTF-IDs can be made available for kernel/drivers, I wrote a small
> patch for this a while ago.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp_metadata3&id=6c1cb83629226889d6fadd3ba694e827fca3e247
>
> So the basic use case is:
> 1- The driver/kernel registers a BTF format (one or more).

This is now not needed, it happens automatically for module BTF.

> 2- Userland queries the driver's registered BTF to be able to understand the
> kernel/driver buffer format.

Here the module might need to know its BTF's ID, in addition to BTF
type ID. Or maybe it doesn't. User-space tools can just access BTF
from /sys/kernel/btf/module_name and use provided BTF type ID to dump
whatever is necessary.

>
> driver example of using this infrastructure:
> https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp_metadata3&id=9c24657d6cb3a7852c2e948dc9782f3f39b60104

This, thankfully, won't be needed, you'll just have a normal C struct
and it will be just present in module's BTF.
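
As a rough illustration of the "normal C struct" idea, the sketch below shows
a hypothetical metadata layout and a bounds-checked way of viewing a metadata
area through it. Everything here is an assumption made up for the example:
struct toy_xdp_meta, its fields, and toy_meta_view() do not come from any
driver or from this series, and a real XDP program would do the equivalent
bounds check on the context's metadata and data pointers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata layout; a real driver would define its own struct. */
struct toy_xdp_meta {
    uint32_t rx_hash;
    uint16_t vlan_tci;
    uint16_t flags;
};

/*
 * Return a typed view of the metadata area [meta, data) if it is large
 * enough to hold struct toy_xdp_meta, or NULL otherwise. No parsing is
 * done: the struct layout itself is the format description.
 */
static const struct toy_xdp_meta *toy_meta_view(const void *meta, const void *data)
{
    const unsigned char *start = meta;
    const unsigned char *end = data;

    assert(meta != NULL && data != NULL);
    if (end < start || (size_t)(end - start) < sizeof(struct toy_xdp_meta))
        return NULL;
    return (const struct toy_xdp_meta *)start;
}
```

The point of the sketch is that the consumer needs only the struct definition,
which is exactly what a C dump of the module's BTF could provide.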

>
> User Queries driver's XDP metadata BTF format:
> https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp_metadata3&id=6a117e2d9196f58de7cf067741e84ec242af27f6

For this we need support for BTF_ID macro for modules. As I said, it's
pretty easy to add, but feel free to contribute this once the basic
infra lands.

>
> Dump it as C header style:
> https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp_metadata3&id=8bd99626879bff28379707ac3a2c3bb94fd5b410

This is available as libbpf-provided API now (see btf_dump APIs). And
bpftool has support to dump all BTF types as C definitions as well.
You might want to do something a bit more targeted, but that's
details.

>
> And then use it in your XDP program to parse the packet metadata passed
> from this specific driver. (I mean no real parsing is required; you
> just point to the metadata buffer with the metadata BTF-formatted C
> structure.)
>
> > If you are asking about support for using BTF_ID_LIST() macro in a
> > kernel module, then right now we don't call resolve_btfids on modules,
> > so it's not supported there yet. It's trivial to add, but we'll
> > probably need to teach resolve_btfids to understand split BTF. We can
> > do that separately after the basic "infra" lands, though.

patchwork-bot+netdevbpf@kernel.org Nov. 6, 2020, 2:50 a.m. UTC | #5
Hello:

This series was applied to bpf/bpf-next.git (refs/heads/master):

On Wed, 4 Nov 2020 20:33:50 -0800 you wrote:
> This patch set adds support for generating and deduplicating split BTF. This
> is an enhancement to the BTF, which allows to designate one BTF as the "base
> BTF" (e.g., vmlinux BTF), and one or more other BTFs as "split BTF" (e.g.,
> kernel module BTF), which are building upon and extending base BTF with extra
> types and strings.
> 
> Once loaded, split BTF appears as a single unified BTF superset of base BTF,
> with continuous and transparent numbering scheme. This allows all the existing
> users of BTF to work correctly and stay agnostic to the base/split BTFs
> composition.  The only difference is in how to instantiate split BTF: it
> requires base BTF to be already instantiated and passed to btf__new_xxx_split()
> or btf__parse_xxx_split() "constructors" explicitly.
> 
> [...]

Here is the summary with links:
  - [v2,bpf-next,01/11] libbpf: factor out common operations in BTF writing APIs
    https://git.kernel.org/bpf/bpf-next/c/c81ed6d81e05
  - [v2,bpf-next,02/11] selftest/bpf: relax btf_dedup test checks
    https://git.kernel.org/bpf/bpf-next/c/d9448f94962b
  - [v2,bpf-next,03/11] libbpf: unify and speed up BTF string deduplication
    https://git.kernel.org/bpf/bpf-next/c/88a82c2a9ab5
  - [v2,bpf-next,04/11] libbpf: implement basic split BTF support
    https://git.kernel.org/bpf/bpf-next/c/ba451366bf44
  - [v2,bpf-next,05/11] selftests/bpf: add split BTF basic test
    https://git.kernel.org/bpf/bpf-next/c/197389da2fbf
  - [v2,bpf-next,06/11] selftests/bpf: add checking of raw type dump in BTF writer APIs selftests
    https://git.kernel.org/bpf/bpf-next/c/1306c980cf89
  - [v2,bpf-next,07/11] libbpf: fix BTF data layout checks and allow empty BTF
    https://git.kernel.org/bpf/bpf-next/c/d8123624506c
  - [v2,bpf-next,08/11] libbpf: support BTF dedup of split BTFs
    https://git.kernel.org/bpf/bpf-next/c/f86524efcf9e
  - [v2,bpf-next,09/11] libbpf: accomodate DWARF/compiler bug with duplicated identical arrays
    https://git.kernel.org/bpf/bpf-next/c/6b6e6b1d09aa
  - [v2,bpf-next,10/11] selftests/bpf: add split BTF dedup selftests
    https://git.kernel.org/bpf/bpf-next/c/232338fa2fb4
  - [v2,bpf-next,11/11] tools/bpftool: add bpftool support for split BTF
    https://git.kernel.org/bpf/bpf-next/c/75fa1777694c

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html