Message ID | 20190820114706.18546-1-toke@redhat.com |
---|---|
Headers | show |
Series | Convert iproute2 to use libbpf (WIP) | expand |
On Tue, Aug 20, 2019 at 01:47:01PM +0200, Toke Høiland-Jørgensen wrote: > iproute2 uses its own bpf loader to load eBPF programs, which has > evolved separately from libbpf. Since we are now standardising on > libbpf, this becomes a problem as iproute2 is slowly accumulating > feature incompatibilities with libbpf-based loaders. In particular, > iproute2 has its own (expanded) version of the map definition struct, > which makes it difficult to write programs that can be loaded with both > custom loaders and iproute2. > > This series seeks to address this by converting iproute2 to using libbpf > for all its bpf needs. This version is an early proof-of-concept RFC, to > get some feedback on whether people think this is the right direction. > > What this series does is the following: > > - Updates the libbpf map definition struct to match that of iproute2 > (patch 1). > - Adds functionality to libbpf to support automatic pinning of maps when > loading an eBPF program, while re-using pinned maps if they already > exist (patches 2-3). > - Modifies iproute2 to make it possible to compile it against libbpf > without affecting any existing functionality (patch 4). > - Changes the iproute2 eBPF loader to use libbpf for loading XDP > programs (patch 5). > > > As this is an early PoC, there are still a few missing pieces before > this can be merged. Including (but probably not limited to): > > - Consolidate the map definition struct in the bpf_helpers.h file in the > kernel tree. This contains a different, and incompatible, update to > the struct. Since the iproute2 version has actually been released for > use outside the kernel tree (and thus is subject to API stability > constraints), I think it makes the most sense to keep that, and port > the selftests to use it. It sounds like you're implying that existing libbpf format is not uapi. It is and we cannot break it. If patch 1 means breakage for existing pre-compiled .o that won't load with new libbpf then we cannot use this method. Recompiling .o with new libbpf definition of bpf_map_def isn't an option. libbpf has to be smart before/after and recognize both old and iproute2 format.
On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > iproute2 uses its own bpf loader to load eBPF programs, which has > evolved separately from libbpf. Since we are now standardising on > libbpf, this becomes a problem as iproute2 is slowly accumulating > feature incompatibilities with libbpf-based loaders. In particular, > iproute2 has its own (expanded) version of the map definition struct, > which makes it difficult to write programs that can be loaded with both > custom loaders and iproute2. > > This series seeks to address this by converting iproute2 to using libbpf > for all its bpf needs. This version is an early proof-of-concept RFC, to > get some feedback on whether people think this is the right direction. > > What this series does is the following: > > - Updates the libbpf map definition struct to match that of iproute2 > (patch 1). Hi Toke, Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm totally in support of making iproute2 use libbpf to load/initialize BPF programs. But I'm against adding iproute2-specific fields to libbpf's bpf_map_def definitions to support this. I've proposed the plan of extending libbpf's supported features so that it can be used to load iproute2-style BPF programs earlier, please see discussions in [0] and [1]. I think instead of emulating iproute2 way of matching everything based on user-specified internal IDs, which doesn't provide good user experience and is quite easy to get wrong, we should support same scenarios with better declarative syntax and in a less error-prone way. I believe we can do that by relying on BTF more heavily (again, please check some of my proposals in [0], [1], and discussion with Daniel in those threads). It will feel more natural and be more straightforward to follow. It would be great if you can lend a hand in implementing pieces of that plan! I'm currently on vacation, so my availability is very sparse, but I'd be happy to discuss this further, if need be. [0] https://lore.kernel.org/bpf/CAEf4BzbfdG2ub7gCi0OYqBrUoChVHWsmOntWAkJt47=FE+km+A@xxxxxxxxxxxxxx/ [1] https://www.spinics.net/lists/bpf/msg03976.html > - Adds functionality to libbpf to support automatic pinning of maps when > loading an eBPF program, while re-using pinned maps if they already > exist (patches 2-3). > - Modifies iproute2 to make it possible to compile it against libbpf > without affecting any existing functionality (patch 4). > - Changes the iproute2 eBPF loader to use libbpf for loading XDP > programs (patch 5). > > > As this is an early PoC, there are still a few missing pieces before > this can be merged. Including (but probably not limited to): > > - Consolidate the map definition struct in the bpf_helpers.h file in the > kernel tree. This contains a different, and incompatible, update to > the struct. Since the iproute2 version has actually been released for > use outside the kernel tree (and thus is subject to API stability > constraints), I think it makes the most sense to keep that, and port > the selftests to use it. > > - The iproute2 loader supports automatically populating map-in-map > definitions on load. This needs to be added to libbpf as well. There > is an implementation of this in the selftests in the kernel tree, > which will have to be ported (related to the previous point). > > - The iproute2 port needs to be completed; this means at least > supporting TC eBPF programs as well, figuring out how to deal with > cBPF programs, and getting the verbose output back to the same state > as before the port. Also, I guess the iproute2 maintainers need to ACK > that they are good with adding a dependency on libbpf. > > - Some of the code added to libbpf in patch 2 in this series include > code derived from iproute2, which is GPLv2+. So it will need to be > re-licensed to be usable in libbpf. Since `git blame` indicated that > the original code was written by Daniel, I figure he can ACK that > relicensing before applying the patches :) > > > Please take a look at this series and let me know if you agree > that this is the right direction to go. Assuming there's consensus that > it is, I'll focus on getting the rest of the libbpf patches ready for > merging. I'll send those as a separate series, and hold off on the > iproute2 patches until they are merged; but for this version I'm > including both in one series so it's easier to see the context. > > -Toke > > > libbpf: > Toke Høiland-Jørgensen (3): > libbpf: Add map definition struct fields from iproute2 > libbpf: Add support for auto-pinning of maps with reuse on program > load > libbpf: Add support for specifying map pinning path via callback > > tools/lib/bpf/libbpf.c | 205 +++++++++++++++++++++++++++++++++++++++-- > tools/lib/bpf/libbpf.h | 18 ++++ > 2 files changed, 214 insertions(+), 9 deletions(-) > > iproute2: > Toke Høiland-Jørgensen (2): > iproute2: Allow compiling against libbpf > iproute2: Support loading XDP programs with libbpf > > configure | 16 +++++ > include/bpf_util.h | 6 +- > ip/ipvrf.c | 4 +- > lib/bpf.c | 148 ++++++++++++++++++++++++++++++++++++--------- > 4 files changed, 142 insertions(+), 32 deletions(-) > > > -- > 2.22.1 >
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > On Tue, Aug 20, 2019 at 01:47:01PM +0200, Toke Høiland-Jørgensen wrote: >> iproute2 uses its own bpf loader to load eBPF programs, which has >> evolved separately from libbpf. Since we are now standardising on >> libbpf, this becomes a problem as iproute2 is slowly accumulating >> feature incompatibilities with libbpf-based loaders. In particular, >> iproute2 has its own (expanded) version of the map definition struct, >> which makes it difficult to write programs that can be loaded with both >> custom loaders and iproute2. >> >> This series seeks to address this by converting iproute2 to using libbpf >> for all its bpf needs. This version is an early proof-of-concept RFC, to >> get some feedback on whether people think this is the right direction. >> >> What this series does is the following: >> >> - Updates the libbpf map definition struct to match that of iproute2 >> (patch 1). >> - Adds functionality to libbpf to support automatic pinning of maps when >> loading an eBPF program, while re-using pinned maps if they already >> exist (patches 2-3). >> - Modifies iproute2 to make it possible to compile it against libbpf >> without affecting any existing functionality (patch 4). >> - Changes the iproute2 eBPF loader to use libbpf for loading XDP >> programs (patch 5). >> >> >> As this is an early PoC, there are still a few missing pieces before >> this can be merged. Including (but probably not limited to): >> >> - Consolidate the map definition struct in the bpf_helpers.h file in the >> kernel tree. This contains a different, and incompatible, update to >> the struct. Since the iproute2 version has actually been released for >> use outside the kernel tree (and thus is subject to API stability >> constraints), I think it makes the most sense to keep that, and port >> the selftests to use it. > > It sounds like you're implying that existing libbpf format is not > uapi. No, that's not what I meant... See below. > It is and we cannot break it. > If patch 1 means breakage for existing pre-compiled .o that won't load > with new libbpf then we cannot use this method. > Recompiling .o with new libbpf definition of bpf_map_def isn't an option. > libbpf has to be smart before/after and recognize both old and iproute2 format. The libbpf.h definition of struct bpf_map_def is compatible with the one used in iproute2. In libbpf.h, the struct only contains five fields (type, key_size, value_size, max_entries and flags), and iproute2 adds another 4 (id, pinning, inner_id and inner_idx; these are the ones in patch 1 in this series). The issue I was alluding to above is that the bpf_helpers.h file in the kernel selftests directory *also* extends the bpf_map_def struct, and adds two *different* fields (inner_map_idx and numa_mode). The former is used to implement the same map-in-map definition functionality that iproute2 has, but with different semantics. The latter is additional to that, and I'm planning to add that to this series. Since bpf_helpers.h is *not* part of libbpf (yet), this will make it possible to keep API (and ABI) compatibility with both iproute2 and libbpf. As in, old .o files will still load with libbpf after this series, they just won't be able to use the new automatic pinning feature. -Toke
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> iproute2 uses its own bpf loader to load eBPF programs, which has >> evolved separately from libbpf. Since we are now standardising on >> libbpf, this becomes a problem as iproute2 is slowly accumulating >> feature incompatibilities with libbpf-based loaders. In particular, >> iproute2 has its own (expanded) version of the map definition struct, >> which makes it difficult to write programs that can be loaded with both >> custom loaders and iproute2. >> >> This series seeks to address this by converting iproute2 to using libbpf >> for all its bpf needs. This version is an early proof-of-concept RFC, to >> get some feedback on whether people think this is the right direction. >> >> What this series does is the following: >> >> - Updates the libbpf map definition struct to match that of iproute2 >> (patch 1). > > > Hi Toke, > > Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm > totally in support of making iproute2 use libbpf to load/initialize > BPF programs. But I'm against adding iproute2-specific fields to > libbpf's bpf_map_def definitions to support this. > > I've proposed the plan of extending libbpf's supported features so > that it can be used to load iproute2-style BPF programs earlier, > please see discussions in [0] and [1]. Yeah, I've seen that discussion, and agree that longer term this is probably a better way to do map-in-map definitions. However, I view your proposal as complementary to this series: we'll probably also want the BTF-based definition to work with iproute2, and that means iproute2 needs to be ported to libbpf. But iproute2 needs to be backwards compatible with the format it supports now, and, well, this series is the simplest way to achieve that IMO :) > I think instead of emulating iproute2 way of matching everything based > on user-specified internal IDs, which doesn't provide good user > experience and is quite easy to get wrong, we should support same > scenarios with better declarative syntax and in a less error-prone > way. I believe we can do that by relying on BTF more heavily (again, > please check some of my proposals in [0], [1], and discussion with > Daniel in those threads). It will feel more natural and be more > straightforward to follow. It would be great if you can lend a hand in > implementing pieces of that plan! > > I'm currently on vacation, so my availability is very sparse, but I'd > be happy to discuss this further, if need be. Happy to collaborate on your proposal when you're back from vacation; but as I said above, I believe this is a complementary longer-term thing... -Toke
On Wed, Aug 21, 2019 at 2:07 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> iproute2 uses its own bpf loader to load eBPF programs, which has > >> evolved separately from libbpf. Since we are now standardising on > >> libbpf, this becomes a problem as iproute2 is slowly accumulating > >> feature incompatibilities with libbpf-based loaders. In particular, > >> iproute2 has its own (expanded) version of the map definition struct, > >> which makes it difficult to write programs that can be loaded with both > >> custom loaders and iproute2. > >> > >> This series seeks to address this by converting iproute2 to using libbpf > >> for all its bpf needs. This version is an early proof-of-concept RFC, to > >> get some feedback on whether people think this is the right direction. > >> > >> What this series does is the following: > >> > >> - Updates the libbpf map definition struct to match that of iproute2 > >> (patch 1). > > > > > > Hi Toke, > > > > Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm > > totally in support of making iproute2 use libbpf to load/initialize > > BPF programs. But I'm against adding iproute2-specific fields to > > libbpf's bpf_map_def definitions to support this. > > > > I've proposed the plan of extending libbpf's supported features so > > that it can be used to load iproute2-style BPF programs earlier, > > please see discussions in [0] and [1]. > > Yeah, I've seen that discussion, and agree that longer term this is > probably a better way to do map-in-map definitions. > > However, I view your proposal as complementary to this series: we'll > probably also want the BTF-based definition to work with iproute2, and > that means iproute2 needs to be ported to libbpf. But iproute2 needs to > be backwards compatible with the format it supports now, and, well, this > series is the simplest way to achieve that IMO :) Ok, I understand that. But I'd still want to avoid adding extra cruft to libbpf just for backwards-compatibility with *exact* iproute2 format. Libbpf as a whole is trying to move away from relying on binary bpf_map_def and into using BTF-defined map definitions, and this patch series is a step backwards in that regard, that adds, essentially, already outdated stuff that we'll need to support forever (I mean those extra fields in bpf_map_def, that will stay there forever). We've discussed one way to deal with it, IMO, in a cleaner way. It can be done in few steps: 1. I originally wanted BTF-defined map definitions to ignore unknown fields. It shouldn't be a default mode, but it should be supported (and of course is very easy to add). So let's add that and let libbpf ignore unknown stuff. 2. Then to let iproute2 loader deal with backwards-compatibility for libbpf-incompatible bpf_elf_map, we need to "pass-through" all those fields so that users of libbpf (iproute2 loader, in this case) can make use of it. The easiest and cleanest way to do this is to expose BTF ID of a type describing each map entry and let iproute2 process that in whichever way it sees fit. Luckily, bpf_elf_map is compatible in `type` field, which will let libbpf recognize bpf_elf_map as map definition. All the rest setup will be done by iproute2, by processing BTF of bpf_elf_map, which will let it set up map sizes, flags and do all of its map-in-map magic. The only additions to libbpf in this case would be a new `__u32 bpf_map__btf_id(struct bpf_map* map);` API. I haven't written any code and haven't 100% checked that this will cover everything, but I think we should try. This will allow to let users of libbpf do custom stuff with map definitions without having to put all this extra logic into libbpf itself, which I think is desirable outcome. > > > I think instead of emulating iproute2 way of matching everything based > > on user-specified internal IDs, which doesn't provide good user > > experience and is quite easy to get wrong, we should support same > > scenarios with better declarative syntax and in a less error-prone > > way. I believe we can do that by relying on BTF more heavily (again, > > please check some of my proposals in [0], [1], and discussion with > > Daniel in those threads). It will feel more natural and be more > > straightforward to follow. It would be great if you can lend a hand in > > implementing pieces of that plan! > > > > I'm currently on vacation, so my availability is very sparse, but I'd > > be happy to discuss this further, if need be. > > Happy to collaborate on your proposal when you're back from vacation; > but as I said above, I believe this is a complementary longer-term > thing... > > -Toke
On Wed, Aug 21, 2019 at 4:29 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > > > On Tue, Aug 20, 2019 at 01:47:01PM +0200, Toke Høiland-Jørgensen wrote: > >> iproute2 uses its own bpf loader to load eBPF programs, which has > >> evolved separately from libbpf. Since we are now standardising on > >> libbpf, this becomes a problem as iproute2 is slowly accumulating > >> feature incompatibilities with libbpf-based loaders. In particular, > >> iproute2 has its own (expanded) version of the map definition struct, > >> which makes it difficult to write programs that can be loaded with both > >> custom loaders and iproute2. > >> > >> This series seeks to address this by converting iproute2 to using libbpf > >> for all its bpf needs. This version is an early proof-of-concept RFC, to > >> get some feedback on whether people think this is the right direction. > >> > >> What this series does is the following: > >> > >> - Updates the libbpf map definition struct to match that of iproute2 > >> (patch 1). > >> - Adds functionality to libbpf to support automatic pinning of maps when > >> loading an eBPF program, while re-using pinned maps if they already > >> exist (patches 2-3). > >> - Modifies iproute2 to make it possible to compile it against libbpf > >> without affecting any existing functionality (patch 4). > >> - Changes the iproute2 eBPF loader to use libbpf for loading XDP > >> programs (patch 5). > >> > >> > >> As this is an early PoC, there are still a few missing pieces before > >> this can be merged. Including (but probably not limited to): > >> > >> - Consolidate the map definition struct in the bpf_helpers.h file in the > >> kernel tree. This contains a different, and incompatible, update to > >> the struct. Since the iproute2 version has actually been released for > >> use outside the kernel tree (and thus is subject to API stability > >> constraints), I think it makes the most sense to keep that, and port > >> the selftests to use it. > > > > It sounds like you're implying that existing libbpf format is not > > uapi. > > No, that's not what I meant... See below. > > > It is and we cannot break it. > > If patch 1 means breakage for existing pre-compiled .o that won't load > > with new libbpf then we cannot use this method. > > Recompiling .o with new libbpf definition of bpf_map_def isn't an option. > > libbpf has to be smart before/after and recognize both old and iproute2 format. > > The libbpf.h definition of struct bpf_map_def is compatible with the one > used in iproute2. In libbpf.h, the struct only contains five fields > (type, key_size, value_size, max_entries and flags), and iproute2 adds > another 4 (id, pinning, inner_id and inner_idx; these are the ones in > patch 1 in this series). > > The issue I was alluding to above is that the bpf_helpers.h file in the > kernel selftests directory *also* extends the bpf_map_def struct, and > adds two *different* fields (inner_map_idx and numa_mode). The former is > used to implement the same map-in-map definition functionality that > iproute2 has, but with different semantics. The latter is additional to > that, and I'm planning to add that to this series. > > Since bpf_helpers.h is *not* part of libbpf (yet), this will make it We should start considering it as if it was, so if we can avoid adding stuff that I'd need to untangle to move it into libbpf, I'd rather avoid it. We've already prepared this move by relicensing bpf_helpers.h. Moving it into libbpf itself is immediate next thing I'll do when I'm back. > possible to keep API (and ABI) compatibility with both iproute2 and > libbpf. As in, old .o files will still load with libbpf after this > series, they just won't be able to use the new automatic pinning > feature. > > -Toke
On 8/22/19 9:49 AM, Andrii Nakryiko wrote: > On Wed, Aug 21, 2019 at 2:07 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >>> On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >>>> >>>> iproute2 uses its own bpf loader to load eBPF programs, which has >>>> evolved separately from libbpf. Since we are now standardising on >>>> libbpf, this becomes a problem as iproute2 is slowly accumulating >>>> feature incompatibilities with libbpf-based loaders. In particular, >>>> iproute2 has its own (expanded) version of the map definition struct, >>>> which makes it difficult to write programs that can be loaded with both >>>> custom loaders and iproute2. >>>> >>>> This series seeks to address this by converting iproute2 to using libbpf >>>> for all its bpf needs. This version is an early proof-of-concept RFC, to >>>> get some feedback on whether people think this is the right direction. >>>> >>>> What this series does is the following: >>>> >>>> - Updates the libbpf map definition struct to match that of iproute2 >>>> (patch 1). >>> >>> Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm >>> totally in support of making iproute2 use libbpf to load/initialize >>> BPF programs. But I'm against adding iproute2-specific fields to >>> libbpf's bpf_map_def definitions to support this. >>> >>> I've proposed the plan of extending libbpf's supported features so >>> that it can be used to load iproute2-style BPF programs earlier, >>> please see discussions in [0] and [1]. >> >> Yeah, I've seen that discussion, and agree that longer term this is >> probably a better way to do map-in-map definitions. >> >> However, I view your proposal as complementary to this series: we'll >> probably also want the BTF-based definition to work with iproute2, and >> that means iproute2 needs to be ported to libbpf. But iproute2 needs to >> be backwards compatible with the format it supports now, and, well, this >> series is the simplest way to achieve that IMO :) > > Ok, I understand that. But I'd still want to avoid adding extra cruft > to libbpf just for backwards-compatibility with *exact* iproute2 > format. Libbpf as a whole is trying to move away from relying on > binary bpf_map_def and into using BTF-defined map definitions, and > this patch series is a step backwards in that regard, that adds, > essentially, already outdated stuff that we'll need to support forever > (I mean those extra fields in bpf_map_def, that will stay there > forever). Agree, adding these extensions for libbpf would be a step backwards compared to using BTF defined map defs. > We've discussed one way to deal with it, IMO, in a cleaner way. It can > be done in few steps: > > 1. I originally wanted BTF-defined map definitions to ignore unknown > fields. It shouldn't be a default mode, but it should be supported > (and of course is very easy to add). So let's add that and let libbpf > ignore unknown stuff. > > 2. Then to let iproute2 loader deal with backwards-compatibility for > libbpf-incompatible bpf_elf_map, we need to "pass-through" all those > fields so that users of libbpf (iproute2 loader, in this case) can > make use of it. The easiest and cleanest way to do this is to expose > BTF ID of a type describing each map entry and let iproute2 process > that in whichever way it sees fit. > > Luckily, bpf_elf_map is compatible in `type` field, which will let > libbpf recognize bpf_elf_map as map definition. All the rest setup > will be done by iproute2, by processing BTF of bpf_elf_map, which will > let it set up map sizes, flags and do all of its map-in-map magic. > > The only additions to libbpf in this case would be a new `__u32 > bpf_map__btf_id(struct bpf_map* map);` API. > > I haven't written any code and haven't 100% checked that this will > cover everything, but I think we should try. This will allow to let > users of libbpf do custom stuff with map definitions without having to > put all this extra logic into libbpf itself, which I think is > desirable outcome. Sounds reasonable in general, but all this still has the issue that we're assuming that BTF is /always/ present. Existing object files that would load just fine /today/ but do not have BTF attached won't be handled here. Wouldn't it be more straight forward to allow passing callbacks to the libbpf loader such that if the map section is not found to be bpf_map_def compatible, we rely on external user aka callback to parse the ELF section, handle any non-default libbpf behavior like pinning/retrieving from BPF fs, populate related internal libbpf map data structures and pass control back to libbpf loader afterwards. (Similar callback with prog section name handling for the case where tail call maps get automatically populated.) Thanks, Daniel
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Wed, Aug 21, 2019 at 4:29 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: >> >> > On Tue, Aug 20, 2019 at 01:47:01PM +0200, Toke Høiland-Jørgensen wrote: >> >> iproute2 uses its own bpf loader to load eBPF programs, which has >> >> evolved separately from libbpf. Since we are now standardising on >> >> libbpf, this becomes a problem as iproute2 is slowly accumulating >> >> feature incompatibilities with libbpf-based loaders. In particular, >> >> iproute2 has its own (expanded) version of the map definition struct, >> >> which makes it difficult to write programs that can be loaded with both >> >> custom loaders and iproute2. >> >> >> >> This series seeks to address this by converting iproute2 to using libbpf >> >> for all its bpf needs. This version is an early proof-of-concept RFC, to >> >> get some feedback on whether people think this is the right direction. >> >> >> >> What this series does is the following: >> >> >> >> - Updates the libbpf map definition struct to match that of iproute2 >> >> (patch 1). >> >> - Adds functionality to libbpf to support automatic pinning of maps when >> >> loading an eBPF program, while re-using pinned maps if they already >> >> exist (patches 2-3). >> >> - Modifies iproute2 to make it possible to compile it against libbpf >> >> without affecting any existing functionality (patch 4). >> >> - Changes the iproute2 eBPF loader to use libbpf for loading XDP >> >> programs (patch 5). >> >> >> >> >> >> As this is an early PoC, there are still a few missing pieces before >> >> this can be merged. Including (but probably not limited to): >> >> >> >> - Consolidate the map definition struct in the bpf_helpers.h file in the >> >> kernel tree. This contains a different, and incompatible, update to >> >> the struct. Since the iproute2 version has actually been released for >> >> use outside the kernel tree (and thus is subject to API stability >> >> constraints), I think it makes the most sense to keep that, and port >> >> the selftests to use it. >> > >> > It sounds like you're implying that existing libbpf format is not >> > uapi. >> >> No, that's not what I meant... See below. >> >> > It is and we cannot break it. >> > If patch 1 means breakage for existing pre-compiled .o that won't load >> > with new libbpf then we cannot use this method. >> > Recompiling .o with new libbpf definition of bpf_map_def isn't an option. >> > libbpf has to be smart before/after and recognize both old and iproute2 format. >> >> The libbpf.h definition of struct bpf_map_def is compatible with the one >> used in iproute2. In libbpf.h, the struct only contains five fields >> (type, key_size, value_size, max_entries and flags), and iproute2 adds >> another 4 (id, pinning, inner_id and inner_idx; these are the ones in >> patch 1 in this series). >> >> The issue I was alluding to above is that the bpf_helpers.h file in the >> kernel selftests directory *also* extends the bpf_map_def struct, and >> adds two *different* fields (inner_map_idx and numa_mode). The former is >> used to implement the same map-in-map definition functionality that >> iproute2 has, but with different semantics. The latter is additional to >> that, and I'm planning to add that to this series. >> >> Since bpf_helpers.h is *not* part of libbpf (yet), this will make it > > We should start considering it as if it was, so if we can avoid adding > stuff that I'd need to untangle to move it into libbpf, I'd rather > avoid it. > We've already prepared this move by relicensing bpf_helpers.h. Moving > it into libbpf itself is immediate next thing I'll do when I'm back. Yeah, I figured that with the relicensing, bpf_helpers would probably be making its way into libbpf soon. Which is why I wanted to start this discussion before that: If we do move bpf_helpers as-is, that will put us in the territory of full-on binary incompatibility. So the time to discuss doing this in a compatible way is now, before any such move is made. -Toke
Daniel Borkmann <daniel@iogearbox.net> writes: > On 8/22/19 9:49 AM, Andrii Nakryiko wrote: >> On Wed, Aug 21, 2019 at 2:07 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >>>> On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >>>>> >>>>> iproute2 uses its own bpf loader to load eBPF programs, which has >>>>> evolved separately from libbpf. Since we are now standardising on >>>>> libbpf, this becomes a problem as iproute2 is slowly accumulating >>>>> feature incompatibilities with libbpf-based loaders. In particular, >>>>> iproute2 has its own (expanded) version of the map definition struct, >>>>> which makes it difficult to write programs that can be loaded with both >>>>> custom loaders and iproute2. >>>>> >>>>> This series seeks to address this by converting iproute2 to using libbpf >>>>> for all its bpf needs. This version is an early proof-of-concept RFC, to >>>>> get some feedback on whether people think this is the right direction. >>>>> >>>>> What this series does is the following: >>>>> >>>>> - Updates the libbpf map definition struct to match that of iproute2 >>>>> (patch 1). >>>> >>>> Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm >>>> totally in support of making iproute2 use libbpf to load/initialize >>>> BPF programs. But I'm against adding iproute2-specific fields to >>>> libbpf's bpf_map_def definitions to support this. >>>> >>>> I've proposed the plan of extending libbpf's supported features so >>>> that it can be used to load iproute2-style BPF programs earlier, >>>> please see discussions in [0] and [1]. >>> >>> Yeah, I've seen that discussion, and agree that longer term this is >>> probably a better way to do map-in-map definitions. >>> >>> However, I view your proposal as complementary to this series: we'll >>> probably also want the BTF-based definition to work with iproute2, and >>> that means iproute2 needs to be ported to libbpf. But iproute2 needs to >>> be backwards compatible with the format it supports now, and, well, this >>> series is the simplest way to achieve that IMO :) >> >> Ok, I understand that. But I'd still want to avoid adding extra cruft >> to libbpf just for backwards-compatibility with *exact* iproute2 >> format. Libbpf as a whole is trying to move away from relying on >> binary bpf_map_def and into using BTF-defined map definitions, and >> this patch series is a step backwards in that regard, that adds, >> essentially, already outdated stuff that we'll need to support forever >> (I mean those extra fields in bpf_map_def, that will stay there >> forever). > > Agree, adding these extensions for libbpf would be a step backwards > compared to using BTF defined map defs. > >> We've discussed one way to deal with it, IMO, in a cleaner way. It can >> be done in few steps: >> >> 1. I originally wanted BTF-defined map definitions to ignore unknown >> fields. It shouldn't be a default mode, but it should be supported >> (and of course is very easy to add). So let's add that and let libbpf >> ignore unknown stuff. >> >> 2. Then to let iproute2 loader deal with backwards-compatibility for >> libbpf-incompatible bpf_elf_map, we need to "pass-through" all those >> fields so that users of libbpf (iproute2 loader, in this case) can >> make use of it. The easiest and cleanest way to do this is to expose >> BTF ID of a type describing each map entry and let iproute2 process >> that in whichever way it sees fit. >> >> Luckily, bpf_elf_map is compatible in `type` field, which will let >> libbpf recognize bpf_elf_map as map definition. All the rest setup >> will be done by iproute2, by processing BTF of bpf_elf_map, which will >> let it set up map sizes, flags and do all of its map-in-map magic. >> >> The only additions to libbpf in this case would be a new `__u32 >> bpf_map__btf_id(struct bpf_map* map);` API. >> >> I haven't written any code and haven't 100% checked that this will >> cover everything, but I think we should try. This will allow to let >> users of libbpf do custom stuff with map definitions without having to >> put all this extra logic into libbpf itself, which I think is >> desirable outcome. > > Sounds reasonable in general, but all this still has the issue that > we're assuming that BTF is /always/ present. Existing object files > that would load just fine /today/ but do not have BTF attached won't > be handled here. Wouldn't it be more straight forward to allow passing > callbacks to the libbpf loader such that if the map section is not > found to be bpf_map_def compatible, we rely on external user aka > callback to parse the ELF section, handle any non-default libbpf > behavior like pinning/retrieving from BPF fs, populate related > internal libbpf map data structures and pass control back to libbpf > loader afterwards. (Similar callback with prog section name handling > for the case where tail call maps get automatically populated.) Thinking about this some more, I think there are two separate issues here: 1. Do we want libbpf to support the features currently in iproute2 and bpf_helpers (i.e., map pinning + reuse, map-in-map definitions, and NUMA node placement of maps). IMO the answer to this is yes. 2. What should the data format be for BPF programs to signal that they want to use those features? Here, the longer-term answer is BTF-based map definitions, but we still want iproute2 to be backwards compatible. So how about I revise this patch series to implement the *features* (I already implemented map-in-map and numa nodes[0], so that is sorta already done), but instead of extending the bpf_map_def struct, I just expose callbacks that will allow programs to fill in internal-to-libbpf data structures with the required information. Then, once the BTF-based map definition does land, that can simply define default callbacks that uses the BTF information to fill in those same internal data structures? This would mean no extending bpf_map_def, and relaxing the current libbpf restriction on extending bpf_map_def. The drawback of this approach is that it does nothing to combat fragmentation: People building their own loaders can still reimplement different semantics for map defs, leading to programs that are tied to a particular loader. So this would only work if we really believe BTF can save us from this. I don't feel competent to comment on this just yet, but thought I'd mention it :) -Toke
Toke Høiland-Jørgensen <toke@redhat.com> writes: > Daniel Borkmann <daniel@iogearbox.net> writes: > >> On 8/22/19 9:49 AM, Andrii Nakryiko wrote: >>> On Wed, Aug 21, 2019 at 2:07 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >>>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >>>>> On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >>>>>> >>>>>> iproute2 uses its own bpf loader to load eBPF programs, which has >>>>>> evolved separately from libbpf. Since we are now standardising on >>>>>> libbpf, this becomes a problem as iproute2 is slowly accumulating >>>>>> feature incompatibilities with libbpf-based loaders. In particular, >>>>>> iproute2 has its own (expanded) version of the map definition struct, >>>>>> which makes it difficult to write programs that can be loaded with both >>>>>> custom loaders and iproute2. >>>>>> >>>>>> This series seeks to address this by converting iproute2 to using libbpf >>>>>> for all its bpf needs. This version is an early proof-of-concept RFC, to >>>>>> get some feedback on whether people think this is the right direction. >>>>>> >>>>>> What this series does is the following: >>>>>> >>>>>> - Updates the libbpf map definition struct to match that of iproute2 >>>>>> (patch 1). >>>>> >>>>> Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm >>>>> totally in support of making iproute2 use libbpf to load/initialize >>>>> BPF programs. But I'm against adding iproute2-specific fields to >>>>> libbpf's bpf_map_def definitions to support this. >>>>> >>>>> I've proposed the plan of extending libbpf's supported features so >>>>> that it can be used to load iproute2-style BPF programs earlier, >>>>> please see discussions in [0] and [1]. >>>> >>>> Yeah, I've seen that discussion, and agree that longer term this is >>>> probably a better way to do map-in-map definitions. >>>> >>>> However, I view your proposal as complementary to this series: we'll >>>> probably also want the BTF-based definition to work with iproute2, and >>>> that means iproute2 needs to be ported to libbpf. But iproute2 needs to >>>> be backwards compatible with the format it supports now, and, well, this >>>> series is the simplest way to achieve that IMO :) >>> >>> Ok, I understand that. But I'd still want to avoid adding extra cruft >>> to libbpf just for backwards-compatibility with *exact* iproute2 >>> format. Libbpf as a whole is trying to move away from relying on >>> binary bpf_map_def and into using BTF-defined map definitions, and >>> this patch series is a step backwards in that regard, that adds, >>> essentially, already outdated stuff that we'll need to support forever >>> (I mean those extra fields in bpf_map_def, that will stay there >>> forever). >> >> Agree, adding these extensions for libbpf would be a step backwards >> compared to using BTF defined map defs. >> >>> We've discussed one way to deal with it, IMO, in a cleaner way. It can >>> be done in few steps: >>> >>> 1. I originally wanted BTF-defined map definitions to ignore unknown >>> fields. It shouldn't be a default mode, but it should be supported >>> (and of course is very easy to add). So let's add that and let libbpf >>> ignore unknown stuff. >>> >>> 2. Then to let iproute2 loader deal with backwards-compatibility for >>> libbpf-incompatible bpf_elf_map, we need to "pass-through" all those >>> fields so that users of libbpf (iproute2 loader, in this case) can >>> make use of it. The easiest and cleanest way to do this is to expose >>> BTF ID of a type describing each map entry and let iproute2 process >>> that in whichever way it sees fit. >>> >>> Luckily, bpf_elf_map is compatible in `type` field, which will let >>> libbpf recognize bpf_elf_map as map definition. All the rest setup >>> will be done by iproute2, by processing BTF of bpf_elf_map, which will >>> let it set up map sizes, flags and do all of its map-in-map magic. >>> >>> The only additions to libbpf in this case would be a new `__u32 >>> bpf_map__btf_id(struct bpf_map* map);` API. >>> >>> I haven't written any code and haven't 100% checked that this will >>> cover everything, but I think we should try. This will allow to let >>> users of libbpf do custom stuff with map definitions without having to >>> put all this extra logic into libbpf itself, which I think is >>> desirable outcome. >> >> Sounds reasonable in general, but all this still has the issue that >> we're assuming that BTF is /always/ present. Existing object files >> that would load just fine /today/ but do not have BTF attached won't >> be handled here. Wouldn't it be more straight forward to allow passing >> callbacks to the libbpf loader such that if the map section is not >> found to be bpf_map_def compatible, we rely on external user aka >> callback to parse the ELF section, handle any non-default libbpf >> behavior like pinning/retrieving from BPF fs, populate related >> internal libbpf map data structures and pass control back to libbpf >> loader afterwards. (Similar callback with prog section name handling >> for the case where tail call maps get automatically populated.) > > Thinking about this some more, I think there are two separate issues > here: > > 1. Do we want libbpf to support the features currently in iproute2 and > bpf_helpers (i.e., map pinning + reuse, map-in-map definitions, and > NUMA node placement of maps). IMO the answer to this is yes. > > 2. What should the data format be for BPF programs to signal that they > want to use those features? Here, the longer-term answer is BTF-based > map definitions, but we still want iproute2 to be backwards > compatible. > > > So how about I revise this patch series to implement the *features* (I > already implemented map-in-map and numa nodes[0], so that is sorta > already done), but instead of extending the bpf_map_def struct, I just > expose callbacks that will allow programs to fill in internal-to-libbpf > data structures with the required information. Then, once the BTF-based > map definition does land, that can simply define default callbacks that > uses the BTF information to fill in those same internal data structures? > > This would mean no extending bpf_map_def, and relaxing the current > libbpf restriction on extending bpf_map_def. > > The drawback of this approach is that it does nothing to combat > fragmentation: People building their own loaders can still reimplement > different semantics for map defs, leading to programs that are tied to a > particular loader. So this would only work if we really believe BTF can > save us from this. I don't feel competent to comment on this just yet, > but thought I'd mention it :) > > -Toke [0] was supposed to be a reference to https://github.com/tohojo/libbpf/tree/iproute2-compat
On Thu, Aug 22, 2019 at 1:33 AM Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 8/22/19 9:49 AM, Andrii Nakryiko wrote: > > On Wed, Aug 21, 2019 at 2:07 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > >>> On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >>>> > >>>> iproute2 uses its own bpf loader to load eBPF programs, which has > >>>> evolved separately from libbpf. Since we are now standardising on > >>>> libbpf, this becomes a problem as iproute2 is slowly accumulating > >>>> feature incompatibilities with libbpf-based loaders. In particular, > >>>> iproute2 has its own (expanded) version of the map definition struct, > >>>> which makes it difficult to write programs that can be loaded with both > >>>> custom loaders and iproute2. > >>>> > >>>> This series seeks to address this by converting iproute2 to using libbpf > >>>> for all its bpf needs. This version is an early proof-of-concept RFC, to > >>>> get some feedback on whether people think this is the right direction. > >>>> > >>>> What this series does is the following: > >>>> > >>>> - Updates the libbpf map definition struct to match that of iproute2 > >>>> (patch 1). > >>> > >>> Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm > >>> totally in support of making iproute2 use libbpf to load/initialize > >>> BPF programs. But I'm against adding iproute2-specific fields to > >>> libbpf's bpf_map_def definitions to support this. > >>> > >>> I've proposed the plan of extending libbpf's supported features so > >>> that it can be used to load iproute2-style BPF programs earlier, > >>> please see discussions in [0] and [1]. > >> > >> Yeah, I've seen that discussion, and agree that longer term this is > >> probably a better way to do map-in-map definitions. > >> > >> However, I view your proposal as complementary to this series: we'll > >> probably also want the BTF-based definition to work with iproute2, and > >> that means iproute2 needs to be ported to libbpf. But iproute2 needs to > >> be backwards compatible with the format it supports now, and, well, this > >> series is the simplest way to achieve that IMO :) > > > > Ok, I understand that. But I'd still want to avoid adding extra cruft > > to libbpf just for backwards-compatibility with *exact* iproute2 > > format. Libbpf as a whole is trying to move away from relying on > > binary bpf_map_def and into using BTF-defined map definitions, and > > this patch series is a step backwards in that regard, that adds, > > essentially, already outdated stuff that we'll need to support forever > > (I mean those extra fields in bpf_map_def, that will stay there > > forever). > > Agree, adding these extensions for libbpf would be a step backwards > compared to using BTF defined map defs. > > > We've discussed one way to deal with it, IMO, in a cleaner way. It can > > be done in few steps: > > > > 1. I originally wanted BTF-defined map definitions to ignore unknown > > fields. It shouldn't be a default mode, but it should be supported > > (and of course is very easy to add). So let's add that and let libbpf > > ignore unknown stuff. > > > > 2. Then to let iproute2 loader deal with backwards-compatibility for > > libbpf-incompatible bpf_elf_map, we need to "pass-through" all those > > fields so that users of libbpf (iproute2 loader, in this case) can > > make use of it. The easiest and cleanest way to do this is to expose > > BTF ID of a type describing each map entry and let iproute2 process > > that in whichever way it sees fit. > > > > Luckily, bpf_elf_map is compatible in `type` field, which will let > > libbpf recognize bpf_elf_map as map definition. All the rest setup > > will be done by iproute2, by processing BTF of bpf_elf_map, which will > > let it set up map sizes, flags and do all of its map-in-map magic. > > > > The only additions to libbpf in this case would be a new `__u32 > > bpf_map__btf_id(struct bpf_map* map);` API. > > > > I haven't written any code and haven't 100% checked that this will > > cover everything, but I think we should try. This will allow to let > > users of libbpf do custom stuff with map definitions without having to > > put all this extra logic into libbpf itself, which I think is > > desirable outcome. > > Sounds reasonable in general, but all this still has the issue that we're > assuming that BTF is /always/ present. Existing object files that would load > just fine /today/ but do not have BTF attached won't be handled here. Wouldn't > it be more straight forward to allow passing callbacks to the libbpf loader > such that if the map section is not found to be bpf_map_def compatible, we > rely on external user aka callback to parse the ELF section, handle any > non-default libbpf behavior like pinning/retrieving from BPF fs, populate > related internal libbpf map data structures and pass control back to libbpf Having all those special callbacks feels a bit too narrow-focused. I do agree that we need to provide enough flexibility to allow non-standard BPF loaders like iproute2 to adjust and/or extend some of BPF initialization logic, though. I'm just unsure if adding more and more callbacks is the right approach. Let's think slightly beyond iproute2, because iproute2 and libbpf use the same ELF section name ("maps") for non-BTF-defined map defs, which makes it easy to fall into the trap of designing too specific solution. Let's take, say, BCC. BCC uses one section per each map. And, unlike, iproute2, ELF section names don't coincide. So what if BCC were to use libbpf as an underlying BPF loader, while preserving its layout? This callback for "maps" section won't work at all. BTW, I realize BCC might not be the best example here, given on-the-fly complication nature of its programs and extensive usage of macro, which provide quite a lot of flexibility in changing ELF layout, if necessary, without changing user programs, but bear with me, let's pretend we can't change some of those aspects easily. The point I'm trying to make is that iproute2 and libbpf cases are examples of almost compatible ELF layouts, and if we try to solve just such case, we might end up inventing too specific solution. What seems like a bit more flexible and generic solution is to make libbpf API granular enough to allow users to adjust/add extra maps, programs, relocations, whatever is necessary, before libbpf does final loading or some other checks that would fail for normal libbpf programs, but are ok for custom BPF programs. So instead of having callbacks, we have API for each step, where custom loader can skip and/or adjust behavior of each of them, while also implement their own steps. E.g., today's API is essentially three steps: 1. open and parse ELF: collect relos, programs, map definitions 2. load: create maps from collected defs, do program/global data/CO-RE relocs, load and verify BPF programs 3. attach programs one by one. Between step 1 and 2 user has flexibility to create more maps, set up map-in-map, etc. Between 2 and 3 you can fill in global data, fill in tail call maps, etc. That's already pretty flexible. But we can tune and break apart those steps even further, if necessary. E.g., for iproute2 maps, I think the biggest problem is that we can't create a custom map dynamically, but still allow BPF instructions relocations against that map. Plus, of course, a name clash of "maps" ELF section with incompatible map definitions. So how about: 1. for open call, tell it to not parse "maps" section at all. We can model that as an extra option to override map definitions ELF section name. E.g., so that some applications can put their map defs into "my_fancy_map_def_section" ELF section, for instance. If that section is empty, though, libbpf will just skip step of collecting bpf_map_defs (it can still do BTF-defined maps, btw), allowing gradual migration. 2. provide API to add dynamically created (e.g., by iproute2 loader) maps that are "relocatable to", before program relocations happen. We'll need to figure out best interface here. Internally relocations refer to maps as section index + offset and translate that to map index. If we keep relos (internally) just as section index + offset, it will be simple and consistent interface to add external maps that can be relocated against, IMO. Map index is inconvenient in this case. BTW, we'd need to answer some of those questions regardless, even for callback-based solution. There are a bunch of details to be worked out, of course, and I don't have exact answer to all of them right now, but I think it's a worthwhile exercise to try to answer them and see how API would look like. As an aside, taking a step back and thinking about this whole API design thing, it occurred to me that ELF is just one way to specify programs, maps, relocations, global data, etc. But it doesn't have to be just ELF, right? What if we have a use case where we have "on-the-fly" creation of BPF program, with dynamically added maps, dynamically generated BPF programs, etc. E.g., think about bpftrace generating BPF object/program on-the-fly from their DSL language. Or some pcap to BPF translator for firewall rules, etc, etc. It might be inconvenient and unnecessary to generate ELF and then pass it to libbpf to load it. Instead it could be more convenient to create an empty bpf_object (bpf_object__new) and then use programmatic APIs to add a bunch of maps (some sort of bpf_object__add_map) and progs (bpf_object__add_program) with relocations against those maps, and let libbpf do all the low-level stuff (relos, CO-RE, etc). And provide high-level API for bpf_map, bpf_program, etc. How API would look like to support this? Today's bpf_object__open + bpf__object__load could be just a higher-level wrappers on top of those APIs, constructing all the entities from standardized ELF layout. But there still is low-level API to construct same bpf_object construct dynamically. That seems flexible and powerful and not tied to any particular use case, but of course requires a bit of thought about best API. Sorry for the wall of text :) Callback-based solutions always seem convoluted to me, as well as hard to follow and quite often too limited. This topic is not the easiest one to discuss over email, as well, maybe we should chat about this while at LPC? > loader afterwards. (Similar callback with prog section name handling for the > case where tail call maps get automatically populated.) I'm not sure I completely understand why we need this prog section name callback. Can you elaborate what problem does it solve that can't be solved with existing API? > > Thanks, > Daniel
On Wed, 21 Aug 2019 13:30:09 -0700 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > iproute2 uses its own bpf loader to load eBPF programs, which has > > evolved separately from libbpf. Since we are now standardising on > > libbpf, this becomes a problem as iproute2 is slowly accumulating > > feature incompatibilities with libbpf-based loaders. In particular, > > iproute2 has its own (expanded) version of the map definition struct, > > which makes it difficult to write programs that can be loaded with both > > custom loaders and iproute2. > > > > This series seeks to address this by converting iproute2 to using libbpf > > for all its bpf needs. This version is an early proof-of-concept RFC, to > > get some feedback on whether people think this is the right direction. > > > > What this series does is the following: > > > > - Updates the libbpf map definition struct to match that of iproute2 > > (patch 1). > > > Hi Toke, > > Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm > totally in support of making iproute2 use libbpf to load/initialize > BPF programs. But I'm against adding iproute2-specific fields to > libbpf's bpf_map_def definitions to support this. > > I've proposed the plan of extending libbpf's supported features so > that it can be used to load iproute2-style BPF programs earlier, > please see discussions in [0] and [1]. I think instead of emulating > iproute2 way of matching everything based on user-specified internal > IDs, which doesn't provide good user experience and is quite easy to > get wrong, we should support same scenarios with better declarative > syntax and in a less error-prone way. I believe we can do that by > relying on BTF more heavily (again, please check some of my proposals > in [0], [1], and discussion with Daniel in those threads). It will > feel more natural and be more straightforward to follow. It would be > great if you can lend a hand in implementing pieces of that plan! > > I'm currently on vacation, so my availability is very sparse, but I'd > be happy to discuss this further, if need be. > > [0] https://lore.kernel.org/bpf/CAEf4BzbfdG2ub7gCi0OYqBrUoChVHWsmOntWAkJt47=FE+km+A@mail.gmail.com/ > [1] https://www.spinics.net/lists/bpf/msg03976.html > > > - Adds functionality to libbpf to support automatic pinning of maps when > > loading an eBPF program, while re-using pinned maps if they already > > exist (patches 2-3). For production use-cases, libbpf really need an easier higher-level API for re-using pinned maps, for establishing shared maps between programs. The existing libbpf API bpf_object__pin_maps() and bpf_object__unpin_maps(), which don't re-use pinned maps, are not really usable, because they pin/unpin ALL maps in the ELF file. What users really need is an easy way to specify, on a per map basis, what kind of pinning and reuse/sharing they want. E.g. like iproute2 have, "global", "object-scope", and "no-pinning". ("ifindex-scope" would be nice for XDP). Today users have to split/reimplement bpf_prog_load_xattr(), and use/add bpf_map__reuse_fd(). Which is that I ended doing for xdp-cpumap-tc[2] (used in production at ISP) resulting in 142 lines of extra code[3] that should have been hidden inside libbpf. And worse, in this solution[4] the maps for reuse-pinning is specified in the code by name. Thus, they cannot use a generic loader. That I why, I want to mark the maps via a pinning member, like iproute2. I really hope this moves in a practical direction, as I have the next production request lined up (also from an ISP), and I hate to have to advice them to choose the same route as [3]. [2] https://github.com/xdp-project/xdp-cpumap-tc/ [3] https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/xdp_iphash_to_cpu_user.c#L262-L403 [4] https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/xdp_iphash_to_cpu_user.c#L431-L441
[ ... snip ...] > E.g., today's API is essentially three steps: > > 1. open and parse ELF: collect relos, programs, map definitions > 2. load: create maps from collected defs, do program/global data/CO-RE > relocs, load and verify BPF programs > 3. attach programs one by one. > > Between step 1 and 2 user has flexibility to create more maps, set up > map-in-map, etc. Between 2 and 3 you can fill in global data, fill in > tail call maps, etc. That's already pretty flexible. But we can tune > and break apart those steps even further, if necessary. Today, steps 1 and 2 can be collapsed into a single call to bpf_prog_load_xattr(). As Jesper's mail explains, for XDP we don't generally want to do all the fancy rewriting stuff, we just want a simple way to load a program and get reusable pinning of maps. Preferably in a way that is compatible with the iproute2 loader. So I really think we need two things: (1) a flexible API that splits up all the various steps in a way that allows programs to inject their own map definitions before relocations and loading (2) a simple convenience wrapper that loads an object file, does something sensible with pinning and map-in-map definitions, and loads everything into the kernel. I'd go so far as to say that (2) should even support system-wide configuration, similar to the /etc/iproute2/bpf_pinning file. E.g., an /etc/libbpf/pinning.conf file that sets the default pinning directory, and makes it possible to set up pin-value-to-subdir mappings like what iproute2 does today. Having (2) makes it more likely that all the different custom loaders will be compatible with each other, while still allowing people to do their own custom thing with (1). And of course, (2) could be implemented in terms of (1) internally in libbpf. In my ideal world, (2) would just use the definition format already in iproute2 (this is basically what I implemented already), but if you guys don't want to put this into libbpf, I can probably live with the default format being BTF-based instead. Which would mean that iproute2 I would end up with a flow like this: - When given an elf file, try to run it through the "standard loader" (2). If this works, great, proceed to program attach. - If using (2) fails because it doesn't understand the map definition, fall back to a compatibility loader that parses the legacy iproute2 map definition format and uses (1) to load that. Does the above make sense? :) -Toke
On Fri, Aug 23, 2019 at 3:27 AM Jesper Dangaard Brouer <brouer@redhat.com> wrote: > > On Wed, 21 Aug 2019 13:30:09 -0700 > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > > > iproute2 uses its own bpf loader to load eBPF programs, which has > > > evolved separately from libbpf. Since we are now standardising on > > > libbpf, this becomes a problem as iproute2 is slowly accumulating > > > feature incompatibilities with libbpf-based loaders. In particular, > > > iproute2 has its own (expanded) version of the map definition struct, > > > which makes it difficult to write programs that can be loaded with both > > > custom loaders and iproute2. > > > > > > This series seeks to address this by converting iproute2 to using libbpf > > > for all its bpf needs. This version is an early proof-of-concept RFC, to > > > get some feedback on whether people think this is the right direction. > > > > > > What this series does is the following: > > > > > > - Updates the libbpf map definition struct to match that of iproute2 > > > (patch 1). > > > > > > Hi Toke, > > > > Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm > > totally in support of making iproute2 use libbpf to load/initialize > > BPF programs. But I'm against adding iproute2-specific fields to > > libbpf's bpf_map_def definitions to support this. > > > > I've proposed the plan of extending libbpf's supported features so > > that it can be used to load iproute2-style BPF programs earlier, > > please see discussions in [0] and [1]. I think instead of emulating > > iproute2 way of matching everything based on user-specified internal > > IDs, which doesn't provide good user experience and is quite easy to > > get wrong, we should support same scenarios with better declarative > > syntax and in a less error-prone way. I believe we can do that by > > relying on BTF more heavily (again, please check some of my proposals > > in [0], [1], and discussion with Daniel in those threads). It will > > feel more natural and be more straightforward to follow. It would be > > great if you can lend a hand in implementing pieces of that plan! > > > > I'm currently on vacation, so my availability is very sparse, but I'd > > be happy to discuss this further, if need be. > > > > [0] https://lore.kernel.org/bpf/CAEf4BzbfdG2ub7gCi0OYqBrUoChVHWsmOntWAkJt47=FE+km+A@mail.gmail.com/ > > [1] https://www.spinics.net/lists/bpf/msg03976.html > > > > > - Adds functionality to libbpf to support automatic pinning of maps when > > > loading an eBPF program, while re-using pinned maps if they already > > > exist (patches 2-3). > > For production use-cases, libbpf really need an easier higher-level API > for re-using pinned maps, for establishing shared maps between > programs. The existing libbpf API bpf_object__pin_maps() and > bpf_object__unpin_maps(), which don't re-use pinned maps, are not > really usable, because they pin/unpin ALL maps in the ELF file. > > What users really need is an easy way to specify, on a per map basis, > what kind of pinning and reuse/sharing they want. E.g. like iproute2 > have, "global", "object-scope", and "no-pinning". ("ifindex-scope" would > be nice for XDP). I totally agree and I think this is easy to add both for BTF-defined and "classic" bpf_map_def maps. Daniel mentioned in one of the previous threads that in practice object-scope doesn't seem to be used, so I'd say we should start with no-pinning + global pinning as two initial supported values for pinning attribute. ifindex-scope is interesting, but I'd love to hear a bit more about the use cases. > Today users have to split/reimplement bpf_prog_load_xattr(), and > use/add bpf_map__reuse_fd(). Which is that I ended doing for Honestly, bpf_prog_load_xattr() existence seems redundant to me. It's basically just bpf_object__open + bpf_object__load. There is a piece in the middle with "guessing" program types, but it should just be moved into bpf_object__open and happen by default. Using open + load gives more control and isn't really harder than bpf_prog_load_xattr. bpf_prog_load_xattr which might be slightly more convenient for simple use case, but falls apart immediately if you need to tune anything before load. > xdp-cpumap-tc[2] (used in production at ISP) resulting in 142 lines of > extra code[3] that should have been hidden inside libbpf. And worse, > in this solution[4] the maps for reuse-pinning is specified in the code > by name. Thus, they cannot use a generic loader. That I why, I want > to mark the maps via a pinning member, like iproute2. > > I really hope this moves in a practical direction, as I have the next > production request lined up (also from an ISP), and I hate to have to > advice them to choose the same route as [3]. It seems to me that map pinning doesn't need much discussion at this point, let's start with no-pinning + global pinning. To accommodate pinning at custom root, bpf_object__open_xattr should accept extra argument with non-default pinning root path. That should solve your case completely, shouldn't it? Ultimately, with BTF-defined maps it should be possible to specify custom pinning path on per-map basis for cases where user needs ultimate non-uniform manual control. > > > [2] https://github.com/xdp-project/xdp-cpumap-tc/ > [3] https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/xdp_iphash_to_cpu_user.c#L262-L403 > [4] https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/xdp_iphash_to_cpu_user.c#L431-L441 > -- > Best regards, > Jesper Dangaard Brouer > MSc.CS, Principal Kernel Engineer at Red Hat > LinkedIn: http://www.linkedin.com/in/brouer
On Fri, Aug 23, 2019 at 4:29 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > [ ... snip ...] > > > E.g., today's API is essentially three steps: > > > > 1. open and parse ELF: collect relos, programs, map definitions > > 2. load: create maps from collected defs, do program/global data/CO-RE > > relocs, load and verify BPF programs > > 3. attach programs one by one. > > > > Between step 1 and 2 user has flexibility to create more maps, set up > > map-in-map, etc. Between 2 and 3 you can fill in global data, fill in > > tail call maps, etc. That's already pretty flexible. But we can tune > > and break apart those steps even further, if necessary. > > Today, steps 1 and 2 can be collapsed into a single call to > bpf_prog_load_xattr(). As Jesper's mail explains, for XDP we don't > generally want to do all the fancy rewriting stuff, we just want a > simple way to load a program and get reusable pinning of maps. I agree. See my response to Jesper's message. Note also my view of bpf_prog_load_xattr() existence. > Preferably in a way that is compatible with the iproute2 loader. > > So I really think we need two things: > > (1) a flexible API that splits up all the various steps in a way that > allows programs to inject their own map definitions before > relocations and loading > > (2) a simple convenience wrapper that loads an object file, does > something sensible with pinning and map-in-map definitions, and loads > everything into the kernel. I agree. I think this wrapper is bpf_object__open + bpf_object__load (bpf_prog_load_xattr will do as well, if you don't need to do anything between open and load). I think pinning is simple to add in minimal form and is pretty non-controversial (there is some ambiguity as to how to handle merging of prog array maps, or maybe not just prog array maps, but that can be controlled later through extra flags/attributes, so I'd start with something sensible as a default behavior). > > I'd go so far as to say that (2) should even support system-wide > configuration, similar to the /etc/iproute2/bpf_pinning file. E.g., an > /etc/libbpf/pinning.conf file that sets the default pinning directory, > and makes it possible to set up pin-value-to-subdir mappings like what > iproute2 does today. This I'm a bit hesitant about. It feels like it's not library's job to read some system-wide configs modifying its behavior. We have all those _xattr methods, which allow to override sensible defaults, I'd try to go as far as possible with just that before doing libbpf-specific /etc configs. > > Having (2) makes it more likely that all the different custom loaders > will be compatible with each other, while still allowing people to do > their own custom thing with (1). And of course, (2) could be implemented > in terms of (1) internally in libbpf. > > In my ideal world, (2) would just use the definition format already in > iproute2 (this is basically what I implemented already), but if you guys > don't want to put this into libbpf, I can probably live with the default I want to avoid having legacy-at-the-time-it-was-added code in libbpf that we'd need to support for a long time, that solves only iproute2 cases, which is why I'm pushing back. With BTF we can support same functionality in better form, which is what I want to prioritize and which will be beneficial to the whole BPF ecosystem. But I also want to make libbpf useful to iproute2 and other custom loaders that have to support existing formats, and thus my proposal to have libbpf provide granular enough APIs to augment default format in non-intrusive way. Should this be callback-based or not is secondary, though important to API design, concern. > format being BTF-based instead. Which would mean that iproute2 I would > end up with a flow like this: > > - When given an elf file, try to run it through the "standard loader" > (2). If this works, great, proceed to program attach. > > - If using (2) fails because it doesn't understand the map definition, > fall back to a compatibility loader that parses the legacy iproute2 > map definition format and uses (1) to load that. > > > Does the above make sense? :) It does, yes. Also, with BTF enabled it should be easy to distinguish between those two (e.g., was bpf_elf_map type used? if yes, then it's a compatibility format) and not do extra work. > > -Toke
On Wed, Aug 28, 2019 at 1:40 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Fri, Aug 23, 2019 at 4:29 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > [ ... snip ...] > > > > > E.g., today's API is essentially three steps: > > > > > > 1. open and parse ELF: collect relos, programs, map definitions > > > 2. load: create maps from collected defs, do program/global data/CO-RE > > > relocs, load and verify BPF programs > > > 3. attach programs one by one. > > > > > > Between step 1 and 2 user has flexibility to create more maps, set up > > > map-in-map, etc. Between 2 and 3 you can fill in global data, fill in > > > tail call maps, etc. That's already pretty flexible. But we can tune > > > and break apart those steps even further, if necessary. > > > > Today, steps 1 and 2 can be collapsed into a single call to > > bpf_prog_load_xattr(). As Jesper's mail explains, for XDP we don't > > generally want to do all the fancy rewriting stuff, we just want a > > simple way to load a program and get reusable pinning of maps. > > I agree. See my response to Jesper's message. Note also my view of > bpf_prog_load_xattr() existence. > > > Preferably in a way that is compatible with the iproute2 loader. > > Hi Toke, I was wondering what's the state of converting iproute2 to use libbpf? Is this still something you (or someone else) interested to do? Briefly re-reading the thread, I think libbpf already has almost everything to be used by iproute2. You've added map pinning, so with bpf_map__set_pin_path() iproute2 should be able to specify pinning path, according to its own logic. The only thing missing that I can see is ability to specify numa_node, which we should add both to BTF-defined map definitions (trivial change), as well as probably expose a method like bpf_map__set_numa_node(struct bpf_map *map, int numa_node) for non-declarative and non-BTF legacy cases. There was concern about supporting "extended" bpf_map_def format of iproute2 (bpf_elf_map, actually) with extra fields. I think it's actually easy to handle as is without any extra new APIs. bpf_object__open() w/ .relaxed_maps = true option will process compatible 5 fields of bpf_map_def (type, key/value sizes, max_entries, and map_flags) and will set up corresponding struct bpf_map entries (but won't create BPF maps in kernel yet). Then iproute2 can iterate over "maps" ELF section on its own, and see which maps need to get some more adjustments before load phase: map-in-map set up, numa node, pinning, etc. All those adjustments can be done (except for numa yet) through existing libbpf APIs, as far as I can tell. Once that is taken care of, proceed to bpf_object__load() and other standard steps. No callbacks, no extra cruft. Is there anything else that can block iproute2 conversion to libbpf? BTW, I have a draft patches for declarative (BTF-based) map-in-map set up and initialization the way I described it at Plumbers last year. So while I'm finalizing that, thought I'll resurrect iproute2 thread and see if we can get iproute2 migration to libbpf started.
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Wed, Aug 28, 2019 at 1:40 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: >> >> On Fri, Aug 23, 2019 at 4:29 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> > >> > [ ... snip ...] >> > >> > > E.g., today's API is essentially three steps: >> > > >> > > 1. open and parse ELF: collect relos, programs, map definitions >> > > 2. load: create maps from collected defs, do program/global data/CO-RE >> > > relocs, load and verify BPF programs >> > > 3. attach programs one by one. >> > > >> > > Between step 1 and 2 user has flexibility to create more maps, set up >> > > map-in-map, etc. Between 2 and 3 you can fill in global data, fill in >> > > tail call maps, etc. That's already pretty flexible. But we can tune >> > > and break apart those steps even further, if necessary. >> > >> > Today, steps 1 and 2 can be collapsed into a single call to >> > bpf_prog_load_xattr(). As Jesper's mail explains, for XDP we don't >> > generally want to do all the fancy rewriting stuff, we just want a >> > simple way to load a program and get reusable pinning of maps. >> >> I agree. See my response to Jesper's message. Note also my view of >> bpf_prog_load_xattr() existence. >> >> > Preferably in a way that is compatible with the iproute2 loader. >> > > > Hi Toke, > > I was wondering what's the state of converting iproute2 to use libbpf? > Is this still something you (or someone else) interested to do? Yeah, it's still on my list; planning to circle back to it once I have finished an RFC implementation for XDP multiprog loading based on the new function-replacing in the kernel. (Not that this should keep anyone else from giving the conversion a go and beating me to it :)). > Briefly re-reading the thread, I think libbpf already has almost > everything to be used by iproute2. You've added map pinning, so with > bpf_map__set_pin_path() iproute2 should be able to specify pinning > path, according to its own logic. The only thing missing that I can > see is ability to specify numa_node, which we should add both to > BTF-defined map definitions (trivial change), as well as probably > expose a method like bpf_map__set_numa_node(struct bpf_map *map, int > numa_node) for non-declarative and non-BTF legacy cases. Yes, adding this to libbpf would be good. > There was concern about supporting "extended" bpf_map_def format of > iproute2 (bpf_elf_map, actually) with extra fields. I think it's > actually easy to handle as is without any extra new APIs. > bpf_object__open() w/ .relaxed_maps = true option will process > compatible 5 fields of bpf_map_def (type, key/value sizes, > max_entries, and map_flags) and will set up corresponding struct > bpf_map entries (but won't create BPF maps in kernel yet). Then > iproute2 can iterate over "maps" ELF section on its own, and see which > maps need to get some more adjustments before load phase: map-in-map > set up, numa node, pinning, etc. All those adjustments can be done > (except for numa yet) through existing libbpf APIs, as far as I can > tell. Once that is taken care of, proceed to bpf_object__load() and > other standard steps. No callbacks, no extra cruft. > > Is there anything else that can block iproute2 conversion to libbpf? I haven't looked into the details since my last RFC conversion series, but from what I recall from that, and what we've been changing in libbpf since, I was basically planning to do what you explained. So while there are some details to work out, I believe it's basically straight forward, and I can't think of anything that should block it. > BTW, I have a draft patches for declarative (BTF-based) map-in-map set > up and initialization the way I described it at Plumbers last year. So > while I'm finalizing that, thought I'll resurrect iproute2 thread and > see if we can get iproute2 migration to libbpf started. Great! FWIW, as long as we have the legacy compatibility code in iproute2, I don't think it'll be a problem (or a blocker for the conversion) if BTF-defined maps can't express all the same things as the legacy maps. The missing bits will come automatically as libbpf is updated. But great to hear that you're working on this :) -Toke
On Mon, Feb 3, 2020 at 11:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > On Wed, Aug 28, 2019 at 1:40 PM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > >> > >> On Fri, Aug 23, 2019 at 4:29 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > > >> > [ ... snip ...] > >> > > >> > > E.g., today's API is essentially three steps: > >> > > > >> > > 1. open and parse ELF: collect relos, programs, map definitions > >> > > 2. load: create maps from collected defs, do program/global data/CO-RE > >> > > relocs, load and verify BPF programs > >> > > 3. attach programs one by one. > >> > > > >> > > Between step 1 and 2 user has flexibility to create more maps, set up > >> > > map-in-map, etc. Between 2 and 3 you can fill in global data, fill in > >> > > tail call maps, etc. That's already pretty flexible. But we can tune > >> > > and break apart those steps even further, if necessary. > >> > > >> > Today, steps 1 and 2 can be collapsed into a single call to > >> > bpf_prog_load_xattr(). As Jesper's mail explains, for XDP we don't > >> > generally want to do all the fancy rewriting stuff, we just want a > >> > simple way to load a program and get reusable pinning of maps. > >> > >> I agree. See my response to Jesper's message. Note also my view of > >> bpf_prog_load_xattr() existence. > >> > >> > Preferably in a way that is compatible with the iproute2 loader. > >> > > > > > Hi Toke, > > > > I was wondering what's the state of converting iproute2 to use libbpf? > > Is this still something you (or someone else) interested to do? > > Yeah, it's still on my list; planning to circle back to it once I have > finished an RFC implementation for XDP multiprog loading based on the > new function-replacing in the kernel. > > (Not that this should keep anyone else from giving the conversion a go > and beating me to it :)). > > > Briefly re-reading the thread, I think libbpf already has almost > > everything to be used by iproute2. You've added map pinning, so with > > bpf_map__set_pin_path() iproute2 should be able to specify pinning > > path, according to its own logic. The only thing missing that I can > > see is ability to specify numa_node, which we should add both to > > BTF-defined map definitions (trivial change), as well as probably > > expose a method like bpf_map__set_numa_node(struct bpf_map *map, int > > numa_node) for non-declarative and non-BTF legacy cases. > > Yes, adding this to libbpf would be good. > > > There was concern about supporting "extended" bpf_map_def format of > > iproute2 (bpf_elf_map, actually) with extra fields. I think it's > > actually easy to handle as is without any extra new APIs. > > bpf_object__open() w/ .relaxed_maps = true option will process > > compatible 5 fields of bpf_map_def (type, key/value sizes, > > max_entries, and map_flags) and will set up corresponding struct > > bpf_map entries (but won't create BPF maps in kernel yet). Then > > iproute2 can iterate over "maps" ELF section on its own, and see which > > maps need to get some more adjustments before load phase: map-in-map > > set up, numa node, pinning, etc. All those adjustments can be done > > (except for numa yet) through existing libbpf APIs, as far as I can > > tell. Once that is taken care of, proceed to bpf_object__load() and > > other standard steps. No callbacks, no extra cruft. > > > > Is there anything else that can block iproute2 conversion to libbpf? > > I haven't looked into the details since my last RFC conversion series, > but from what I recall from that, and what we've been changing in libbpf > since, I was basically planning to do what you explained. So while there > are some details to work out, I believe it's basically straight forward, > and I can't think of anything that should block it. > Great! Just to disambiguate and make sure we are in agreement, my hope here is that iproute2 can completely delegate to libbpf all the ELF parsing, map creation, program loading, etc (including all the new stuff like global variables, etc). And only for legacy maps in SEC("maps"), it would have to parse that *single* ELF section (again, on its own) and see if there are any extra features of struct bpf_elf_map requested (i.e., numa, map-in-map, pinning), and if yes, it would use programmatic libbpf APIs to set this up. It might need to do additional BPF_PROG_ARRAY set up after BPF programs are loaded (because iproute2 has its custom naming-based convention). But hopefully we'll encourage people to gradually migrate to BTF-defined maps with declarative ways of doing all that. > > BTW, I have a draft patches for declarative (BTF-based) map-in-map set > > up and initialization the way I described it at Plumbers last year. So > > while I'm finalizing that, thought I'll resurrect iproute2 thread and > > see if we can get iproute2 migration to libbpf started. > > Great! FWIW, as long as we have the legacy compatibility code in > iproute2, I don't think it'll be a problem (or a blocker for the > conversion) if BTF-defined maps can't express all the same things as the > legacy maps. The missing bits will come automatically as libbpf is > updated. But great to hear that you're working on this :) > > -Toke >
On 2/3/20 5:56 PM, Andrii Nakryiko wrote: > Great! Just to disambiguate and make sure we are in agreement, my hope > here is that iproute2 can completely delegate to libbpf all the ELF > iproute2 needs to compile and continue working as is when libbpf is not available. e.g., add check in configure to define HAVE_LIBBPF and move the existing code and move under else branch.
On Mon, Feb 3, 2020 at 5:46 PM David Ahern <dsahern@gmail.com> wrote: > > On 2/3/20 5:56 PM, Andrii Nakryiko wrote: > > Great! Just to disambiguate and make sure we are in agreement, my hope > > here is that iproute2 can completely delegate to libbpf all the ELF > > > > iproute2 needs to compile and continue working as is when libbpf is not > available. e.g., add check in configure to define HAVE_LIBBPF and move > the existing code and move under else branch. Wouldn't it be better to statically compile against libbpf in this case and get rid a lot of BPF-related code and simplify the rest of it? This can be easily done by using libbpf through submodule, the same way as BCC and pahole do it.
On 2/3/20 8:41 PM, Andrii Nakryiko wrote: > On Mon, Feb 3, 2020 at 5:46 PM David Ahern <dsahern@gmail.com> wrote: >> >> On 2/3/20 5:56 PM, Andrii Nakryiko wrote: >>> Great! Just to disambiguate and make sure we are in agreement, my hope >>> here is that iproute2 can completely delegate to libbpf all the ELF >>> >> >> iproute2 needs to compile and continue working as is when libbpf is not >> available. e.g., add check in configure to define HAVE_LIBBPF and move >> the existing code and move under else branch. > > Wouldn't it be better to statically compile against libbpf in this > case and get rid a lot of BPF-related code and simplify the rest of > it? This can be easily done by using libbpf through submodule, the > same way as BCC and pahole do it. > iproute2 compiles today and runs on older distributions and older distributions with newer kernels. That needs to hold true after the move to libbpf.
On Mon, Feb 3, 2020 at 8:53 PM David Ahern <dsahern@gmail.com> wrote: > > On 2/3/20 8:41 PM, Andrii Nakryiko wrote: > > On Mon, Feb 3, 2020 at 5:46 PM David Ahern <dsahern@gmail.com> wrote: > >> > >> On 2/3/20 5:56 PM, Andrii Nakryiko wrote: > >>> Great! Just to disambiguate and make sure we are in agreement, my hope > >>> here is that iproute2 can completely delegate to libbpf all the ELF > >>> > >> > >> iproute2 needs to compile and continue working as is when libbpf is not > >> available. e.g., add check in configure to define HAVE_LIBBPF and move > >> the existing code and move under else branch. > > > > Wouldn't it be better to statically compile against libbpf in this > > case and get rid a lot of BPF-related code and simplify the rest of > > it? This can be easily done by using libbpf through submodule, the > > same way as BCC and pahole do it. > > > > iproute2 compiles today and runs on older distributions and older > distributions with newer kernels. That needs to hold true after the move > to libbpf. And by statically compiling against libbpf, checked out as a submodule, that will still hold true, wouldn't it? Or there is some complications I'm missing? Libbpf is designed to handle old kernels with no problems.
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Mon, Feb 3, 2020 at 8:53 PM David Ahern <dsahern@gmail.com> wrote: >> >> On 2/3/20 8:41 PM, Andrii Nakryiko wrote: >> > On Mon, Feb 3, 2020 at 5:46 PM David Ahern <dsahern@gmail.com> wrote: >> >> >> >> On 2/3/20 5:56 PM, Andrii Nakryiko wrote: >> >>> Great! Just to disambiguate and make sure we are in agreement, my hope >> >>> here is that iproute2 can completely delegate to libbpf all the ELF >> >>> >> >> >> >> iproute2 needs to compile and continue working as is when libbpf is not >> >> available. e.g., add check in configure to define HAVE_LIBBPF and move >> >> the existing code and move under else branch. >> > >> > Wouldn't it be better to statically compile against libbpf in this >> > case and get rid a lot of BPF-related code and simplify the rest of >> > it? This can be easily done by using libbpf through submodule, the >> > same way as BCC and pahole do it. >> > >> >> iproute2 compiles today and runs on older distributions and older >> distributions with newer kernels. That needs to hold true after the move >> to libbpf. > > And by statically compiling against libbpf, checked out as a > submodule, that will still hold true, wouldn't it? Or there is some > complications I'm missing? Libbpf is designed to handle old kernels > with no problems. My plan was to use the same configure test I'm using for xdp-tools (where I in turn copied the structure of the configure script from iproute2): https://github.com/xdp-project/xdp-tools/blob/master/configure#L59 This will look for a system libbpf install and compile against it if it is compatible, and otherwise fall back to a statically linking against a git submodule. We'll need to double-check that this will work on everything currently supported by iproute2, and fix libbpf if there are any issues with that. Not that I foresee any, but you never know :) -Toke
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Mon, Feb 3, 2020 at 11:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >> >> > On Wed, Aug 28, 2019 at 1:40 PM Andrii Nakryiko >> > <andrii.nakryiko@gmail.com> wrote: >> >> >> >> On Fri, Aug 23, 2019 at 4:29 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> > >> >> > [ ... snip ...] >> >> > >> >> > > E.g., today's API is essentially three steps: >> >> > > >> >> > > 1. open and parse ELF: collect relos, programs, map definitions >> >> > > 2. load: create maps from collected defs, do program/global data/CO-RE >> >> > > relocs, load and verify BPF programs >> >> > > 3. attach programs one by one. >> >> > > >> >> > > Between step 1 and 2 user has flexibility to create more maps, set up >> >> > > map-in-map, etc. Between 2 and 3 you can fill in global data, fill in >> >> > > tail call maps, etc. That's already pretty flexible. But we can tune >> >> > > and break apart those steps even further, if necessary. >> >> > >> >> > Today, steps 1 and 2 can be collapsed into a single call to >> >> > bpf_prog_load_xattr(). As Jesper's mail explains, for XDP we don't >> >> > generally want to do all the fancy rewriting stuff, we just want a >> >> > simple way to load a program and get reusable pinning of maps. >> >> >> >> I agree. See my response to Jesper's message. Note also my view of >> >> bpf_prog_load_xattr() existence. >> >> >> >> > Preferably in a way that is compatible with the iproute2 loader. >> >> > >> > >> > Hi Toke, >> > >> > I was wondering what's the state of converting iproute2 to use libbpf? >> > Is this still something you (or someone else) interested to do? >> >> Yeah, it's still on my list; planning to circle back to it once I have >> finished an RFC implementation for XDP multiprog loading based on the >> new function-replacing in the kernel. >> >> (Not that this should keep anyone else from giving the conversion a go >> and beating me to it :)). >> >> > Briefly re-reading the thread, I think libbpf already has almost >> > everything to be used by iproute2. You've added map pinning, so with >> > bpf_map__set_pin_path() iproute2 should be able to specify pinning >> > path, according to its own logic. The only thing missing that I can >> > see is ability to specify numa_node, which we should add both to >> > BTF-defined map definitions (trivial change), as well as probably >> > expose a method like bpf_map__set_numa_node(struct bpf_map *map, int >> > numa_node) for non-declarative and non-BTF legacy cases. >> >> Yes, adding this to libbpf would be good. >> >> > There was concern about supporting "extended" bpf_map_def format of >> > iproute2 (bpf_elf_map, actually) with extra fields. I think it's >> > actually easy to handle as is without any extra new APIs. >> > bpf_object__open() w/ .relaxed_maps = true option will process >> > compatible 5 fields of bpf_map_def (type, key/value sizes, >> > max_entries, and map_flags) and will set up corresponding struct >> > bpf_map entries (but won't create BPF maps in kernel yet). Then >> > iproute2 can iterate over "maps" ELF section on its own, and see which >> > maps need to get some more adjustments before load phase: map-in-map >> > set up, numa node, pinning, etc. All those adjustments can be done >> > (except for numa yet) through existing libbpf APIs, as far as I can >> > tell. Once that is taken care of, proceed to bpf_object__load() and >> > other standard steps. No callbacks, no extra cruft. >> > >> > Is there anything else that can block iproute2 conversion to libbpf? >> >> I haven't looked into the details since my last RFC conversion series, >> but from what I recall from that, and what we've been changing in libbpf >> since, I was basically planning to do what you explained. So while there >> are some details to work out, I believe it's basically straight forward, >> and I can't think of anything that should block it. >> > > Great! Just to disambiguate and make sure we are in agreement, my hope > here is that iproute2 can completely delegate to libbpf all the ELF > parsing, map creation, program loading, etc (including all the new > stuff like global variables, etc). And only for legacy maps in > SEC("maps"), it would have to parse that *single* ELF section (again, > on its own) and see if there are any extra features of struct > bpf_elf_map requested (i.e., numa, map-in-map, pinning), and if yes, > it would use programmatic libbpf APIs to set this up. It might need to > do additional BPF_PROG_ARRAY set up after BPF programs are loaded > (because iproute2 has its custom naming-based convention). But > hopefully we'll encourage people to gradually migrate to BTF-defined > maps with declarative ways of doing all that. Yup, that is my hope as well. Let's see how it goes :) -Toke
On Tue, Feb 4, 2020 at 12:25 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > On Mon, Feb 3, 2020 at 8:53 PM David Ahern <dsahern@gmail.com> wrote: > >> > >> On 2/3/20 8:41 PM, Andrii Nakryiko wrote: > >> > On Mon, Feb 3, 2020 at 5:46 PM David Ahern <dsahern@gmail.com> wrote: > >> >> > >> >> On 2/3/20 5:56 PM, Andrii Nakryiko wrote: > >> >>> Great! Just to disambiguate and make sure we are in agreement, my hope > >> >>> here is that iproute2 can completely delegate to libbpf all the ELF > >> >>> > >> >> > >> >> iproute2 needs to compile and continue working as is when libbpf is not > >> >> available. e.g., add check in configure to define HAVE_LIBBPF and move > >> >> the existing code and move under else branch. > >> > > >> > Wouldn't it be better to statically compile against libbpf in this > >> > case and get rid a lot of BPF-related code and simplify the rest of > >> > it? This can be easily done by using libbpf through submodule, the > >> > same way as BCC and pahole do it. > >> > > >> > >> iproute2 compiles today and runs on older distributions and older > >> distributions with newer kernels. That needs to hold true after the move > >> to libbpf. > > > > And by statically compiling against libbpf, checked out as a > > submodule, that will still hold true, wouldn't it? Or there is some > > complications I'm missing? Libbpf is designed to handle old kernels > > with no problems. > > My plan was to use the same configure test I'm using for xdp-tools > (where I in turn copied the structure of the configure script from > iproute2): > > https://github.com/xdp-project/xdp-tools/blob/master/configure#L59 > > This will look for a system libbpf install and compile against it if it > is compatible, and otherwise fall back to a statically linking against a > git submodule. How will this work when build host has libbpf installed, but target host doesn't? You'll get dynamic linker error when trying to run that tool. If the goal is to have a reliable tool working everywhere, and you already support having libbpf as a submodule, why not always use submodule's libbpf? What's the concern? Libbpf is a small library, I don't think a binary size argument is enough reason to not do this. On the other hand, by using libbpf from submodule, your tool is built *and tested* with a well-known libbpf version that tool-producer controls. > > We'll need to double-check that this will work on everything currently > supported by iproute2, and fix libbpf if there are any issues with that. > Not that I foresee any, but you never know :) > > -Toke >
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Tue, Feb 4, 2020 at 12:25 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >> >> > On Mon, Feb 3, 2020 at 8:53 PM David Ahern <dsahern@gmail.com> wrote: >> >> >> >> On 2/3/20 8:41 PM, Andrii Nakryiko wrote: >> >> > On Mon, Feb 3, 2020 at 5:46 PM David Ahern <dsahern@gmail.com> wrote: >> >> >> >> >> >> On 2/3/20 5:56 PM, Andrii Nakryiko wrote: >> >> >>> Great! Just to disambiguate and make sure we are in agreement, my hope >> >> >>> here is that iproute2 can completely delegate to libbpf all the ELF >> >> >>> >> >> >> >> >> >> iproute2 needs to compile and continue working as is when libbpf is not >> >> >> available. e.g., add check in configure to define HAVE_LIBBPF and move >> >> >> the existing code and move under else branch. >> >> > >> >> > Wouldn't it be better to statically compile against libbpf in this >> >> > case and get rid a lot of BPF-related code and simplify the rest of >> >> > it? This can be easily done by using libbpf through submodule, the >> >> > same way as BCC and pahole do it. >> >> > >> >> >> >> iproute2 compiles today and runs on older distributions and older >> >> distributions with newer kernels. That needs to hold true after the move >> >> to libbpf. >> > >> > And by statically compiling against libbpf, checked out as a >> > submodule, that will still hold true, wouldn't it? Or there is some >> > complications I'm missing? Libbpf is designed to handle old kernels >> > with no problems. >> >> My plan was to use the same configure test I'm using for xdp-tools >> (where I in turn copied the structure of the configure script from >> iproute2): >> >> https://github.com/xdp-project/xdp-tools/blob/master/configure#L59 >> >> This will look for a system libbpf install and compile against it if it >> is compatible, and otherwise fall back to a statically linking against a >> git submodule. > > How will this work when build host has libbpf installed, but target > host doesn't? You'll get dynamic linker error when trying to run that > tool. That's called dependency tracking; distros have various ways of going about that :) But yeah, if you're going to do you own cross-compilation, you'd probably want to just force using the static library. > If the goal is to have a reliable tool working everywhere, and you > already support having libbpf as a submodule, why not always use > submodule's libbpf? What's the concern? Libbpf is a small library, I > don't think a binary size argument is enough reason to not do this. On > the other hand, by using libbpf from submodule, your tool is built > *and tested* with a well-known libbpf version that tool-producer > controls. I thought we already had this discussion? :) libbpf is a library like any other. Distros that package the library want the tools that use it to be dynamically linked against it so library upgrades (especially of the CVE-fixing kind) get picked up by all users. Other distros have memory and space constraints (iproute2 is shipped on OpenWrt, for instance, which is *extremely* space-constrained). And yeah, other deployments don't care and will just statically compile in the vendored version. So we'll need to support all of those use cases. -Toke
On Tue, Feb 4, 2020 at 11:19 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > On Tue, Feb 4, 2020 at 12:25 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > >> > >> > On Mon, Feb 3, 2020 at 8:53 PM David Ahern <dsahern@gmail.com> wrote: > >> >> > >> >> On 2/3/20 8:41 PM, Andrii Nakryiko wrote: > >> >> > On Mon, Feb 3, 2020 at 5:46 PM David Ahern <dsahern@gmail.com> wrote: > >> >> >> > >> >> >> On 2/3/20 5:56 PM, Andrii Nakryiko wrote: > >> >> >>> Great! Just to disambiguate and make sure we are in agreement, my hope > >> >> >>> here is that iproute2 can completely delegate to libbpf all the ELF > >> >> >>> > >> >> >> > >> >> >> iproute2 needs to compile and continue working as is when libbpf is not > >> >> >> available. e.g., add check in configure to define HAVE_LIBBPF and move > >> >> >> the existing code and move under else branch. > >> >> > > >> >> > Wouldn't it be better to statically compile against libbpf in this > >> >> > case and get rid a lot of BPF-related code and simplify the rest of > >> >> > it? This can be easily done by using libbpf through submodule, the > >> >> > same way as BCC and pahole do it. > >> >> > > >> >> > >> >> iproute2 compiles today and runs on older distributions and older > >> >> distributions with newer kernels. That needs to hold true after the move > >> >> to libbpf. > >> > > >> > And by statically compiling against libbpf, checked out as a > >> > submodule, that will still hold true, wouldn't it? Or there is some > >> > complications I'm missing? Libbpf is designed to handle old kernels > >> > with no problems. > >> > >> My plan was to use the same configure test I'm using for xdp-tools > >> (where I in turn copied the structure of the configure script from > >> iproute2): > >> > >> https://github.com/xdp-project/xdp-tools/blob/master/configure#L59 > >> > >> This will look for a system libbpf install and compile against it if it > >> is compatible, and otherwise fall back to a statically linking against a > >> git submodule. > > > > How will this work when build host has libbpf installed, but target > > host doesn't? You'll get dynamic linker error when trying to run that > > tool. > > That's called dependency tracking; distros have various ways of going > about that :) I'm confused, honestly. libbpf is either a dependency and thus can be relied upon to be present in the target system, or it's not and this whole dance with detecting libbpf presence needs to be performed. If libbpf is optional, then I don't see how iproute2 BPF-related code and complexity can be reduced at all, given it should still support loading BPF programs even without libbpf. Furthermore, given libbpf supports more features already and will probably be outpacing iproute2's own BPF support in the future, some users will start relying on BPF features supported only by libbpf "backend", so iproute2's own BPF backend will just fail to load such programs, bringing unpleasant surprises, potentially. So I still fail to see how libbpf can be optional and what benefit does that bring. But shared or static - whatever fits iproute2 best, no preferences. > > But yeah, if you're going to do you own cross-compilation, you'd > probably want to just force using the static library. > > > If the goal is to have a reliable tool working everywhere, and you > > already support having libbpf as a submodule, why not always use > > submodule's libbpf? What's the concern? Libbpf is a small library, I > > don't think a binary size argument is enough reason to not do this. On > > the other hand, by using libbpf from submodule, your tool is built > > *and tested* with a well-known libbpf version that tool-producer > > controls. > > I thought we already had this discussion? :) Yeah, I guess we did for bpftool :) No worries, I don't care that much, just see my notes above about optional libbpf and consequences. > > libbpf is a library like any other. Distros that package the library > want the tools that use it to be dynamically linked against it so > library upgrades (especially of the CVE-fixing kind) get picked up by > all users. Other distros have memory and space constraints (iproute2 is > shipped on OpenWrt, for instance, which is *extremely* > space-constrained). And yeah, other deployments don't care and will just > statically compile in the vendored version. So we'll need to support all > of those use cases. > > -Toke >
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Tue, Feb 4, 2020 at 11:19 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >> >> > On Tue, Feb 4, 2020 at 12:25 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >> >> >> >> > On Mon, Feb 3, 2020 at 8:53 PM David Ahern <dsahern@gmail.com> wrote: >> >> >> >> >> >> On 2/3/20 8:41 PM, Andrii Nakryiko wrote: >> >> >> > On Mon, Feb 3, 2020 at 5:46 PM David Ahern <dsahern@gmail.com> wrote: >> >> >> >> >> >> >> >> On 2/3/20 5:56 PM, Andrii Nakryiko wrote: >> >> >> >>> Great! Just to disambiguate and make sure we are in agreement, my hope >> >> >> >>> here is that iproute2 can completely delegate to libbpf all the ELF >> >> >> >>> >> >> >> >> >> >> >> >> iproute2 needs to compile and continue working as is when libbpf is not >> >> >> >> available. e.g., add check in configure to define HAVE_LIBBPF and move >> >> >> >> the existing code and move under else branch. >> >> >> > >> >> >> > Wouldn't it be better to statically compile against libbpf in this >> >> >> > case and get rid a lot of BPF-related code and simplify the rest of >> >> >> > it? This can be easily done by using libbpf through submodule, the >> >> >> > same way as BCC and pahole do it. >> >> >> > >> >> >> >> >> >> iproute2 compiles today and runs on older distributions and older >> >> >> distributions with newer kernels. That needs to hold true after the move >> >> >> to libbpf. >> >> > >> >> > And by statically compiling against libbpf, checked out as a >> >> > submodule, that will still hold true, wouldn't it? Or there is some >> >> > complications I'm missing? Libbpf is designed to handle old kernels >> >> > with no problems. >> >> >> >> My plan was to use the same configure test I'm using for xdp-tools >> >> (where I in turn copied the structure of the configure script from >> >> iproute2): >> >> >> >> https://github.com/xdp-project/xdp-tools/blob/master/configure#L59 >> >> >> >> This will look for a system libbpf install and compile against it if it >> >> is compatible, and otherwise fall back to a statically linking against a >> >> git submodule. >> > >> > How will this work when build host has libbpf installed, but target >> > host doesn't? You'll get dynamic linker error when trying to run that >> > tool. >> >> That's called dependency tracking; distros have various ways of going >> about that :) > > I'm confused, honestly. libbpf is either a dependency and thus can be > relied upon to be present in the target system, or it's not and this > whole dance with detecting libbpf presence needs to be performed. Yes, and iproute2 is likely to be built in both sorts of environments, so we will have to support both :) > If libbpf is optional, then I don't see how iproute2 BPF-related code > and complexity can be reduced at all, given it should still support > loading BPF programs even without libbpf. Furthermore, given libbpf > supports more features already and will probably be outpacing > iproute2's own BPF support in the future, some users will start > relying on BPF features supported only by libbpf "backend", so > iproute2's own BPF backend will just fail to load such programs, > bringing unpleasant surprises, potentially. So I still fail to see how > libbpf can be optional and what benefit does that bring. I wasn't saying that libbpf itself should be optional; if we're porting things, we should rip out as much of the old code as we can. I just meant that we should support both modes of building, so distros that *do* build libbpf as a library can link iproute2 against that with as little friction as possible. I'm dead set on a specific auto-detection semantic either; I guess it'll be up to the iproute2 maintainers whether they prefer defaulting to one or the other. > But shared or static - whatever fits iproute2 best, no preferences. Right, cool, I think we are basically agreed, given the above :) -Toke
On 2/4/20 2:56 PM, Toke Høiland-Jørgensen wrote: >> I'm confused, honestly. libbpf is either a dependency and thus can be >> relied upon to be present in the target system, or it's not and this >> whole dance with detecting libbpf presence needs to be performed. > > Yes, and iproute2 is likely to be built in both sorts of environments, > so we will have to support both :) > >> If libbpf is optional, then I don't see how iproute2 BPF-related code >> and complexity can be reduced at all, given it should still support >> loading BPF programs even without libbpf. Furthermore, given libbpf >> supports more features already and will probably be outpacing >> iproute2's own BPF support in the future, some users will start >> relying on BPF features supported only by libbpf "backend", so >> iproute2's own BPF backend will just fail to load such programs, >> bringing unpleasant surprises, potentially. So I still fail to see how >> libbpf can be optional and what benefit does that bring. > > I wasn't saying that libbpf itself should be optional; if we're porting > things, we should rip out as much of the old code as we can. I just > meant that we should support both modes of building, so distros that > *do* build libbpf as a library can link iproute2 against that with as > little friction as possible. > > I'm dead set on a specific auto-detection semantic either; I guess it'll > be up to the iproute2 maintainers whether they prefer defaulting to one > or the other. > A few concerns from my perspective: 1. Right now ip comes in around 650k unstripped; libbpf.a for 0.0.7 is around 1.2M with the size of libbpf.o > than ip. Most likely, making iproute2 use libbpf statically is going to be challenging and I am not sure it is the right thing to do (unless the user is building a static version of iproute2 commands). 2. git submodules can be a PITA to deal with (e.g., jumping between branches and versions), so there needs to be a good reason for it. 3. iproute2 code needs to build for a wide range of OSes and not lose functionality compared to what it has today.
David Ahern <dsahern@gmail.com> writes: > On 2/4/20 2:56 PM, Toke Høiland-Jørgensen wrote: >>> I'm confused, honestly. libbpf is either a dependency and thus can be >>> relied upon to be present in the target system, or it's not and this >>> whole dance with detecting libbpf presence needs to be performed. >> >> Yes, and iproute2 is likely to be built in both sorts of environments, >> so we will have to support both :) >> >>> If libbpf is optional, then I don't see how iproute2 BPF-related code >>> and complexity can be reduced at all, given it should still support >>> loading BPF programs even without libbpf. Furthermore, given libbpf >>> supports more features already and will probably be outpacing >>> iproute2's own BPF support in the future, some users will start >>> relying on BPF features supported only by libbpf "backend", so >>> iproute2's own BPF backend will just fail to load such programs, >>> bringing unpleasant surprises, potentially. So I still fail to see how >>> libbpf can be optional and what benefit does that bring. >> >> I wasn't saying that libbpf itself should be optional; if we're porting >> things, we should rip out as much of the old code as we can. I just >> meant that we should support both modes of building, so distros that >> *do* build libbpf as a library can link iproute2 against that with as >> little friction as possible. >> >> I'm dead set on a specific auto-detection semantic either; I guess it'll >> be up to the iproute2 maintainers whether they prefer defaulting to one >> or the other. >> > > A few concerns from my perspective: > > 1. Right now ip comes in around 650k unstripped; libbpf.a for 0.0.7 is > around 1.2M with the size of libbpf.o > than ip. Hmm, I'm getting ~700k for libbpf.a and libbpf.so.0.0.7 is ~480k (for whichever kernel I currently have checked out). But lib/bpf.o in iproute2 is only 80k, so fair point :) > Most likely, making iproute2 use libbpf statically is going to be > challenging and I am not sure it is the right thing to do (unless the > user is building a static version of iproute2 commands). Linking dynamically would imply a new dependency. I'm not necessarily against that, but would it be acceptable from your PoV? And if so, should we keep the current internal BPF code for when libbpf is not available, or would it be acceptable to not be able to load BPF programs if libbpf is not present (similar to how the libelf dependency works today)? > 2. git submodules can be a PITA to deal with (e.g., jumping between > branches and versions), so there needs to be a good reason for it. Yes, totally with you on that. Another option could be to just copy the files into the iproute2 tree, and update them the same way the kernel headers are? Or maybe doing fancy things like this: https://github.com/apenwarr/git-subtrac > 3. iproute2 code needs to build for a wide range of OSes and not lose > functionality compared to what it has today. Could you be a bit more specific about "a wide range of OSes"? I guess we could do the work to make sure libbpf builds on all the same platforms iproute2 supports, but we'd need something a bit more definite to go on... -Toke
On 2/4/20 3:35 PM, Toke Høiland-Jørgensen wrote: > >> Most likely, making iproute2 use libbpf statically is going to be >> challenging and I am not sure it is the right thing to do (unless the >> user is building a static version of iproute2 commands). > > Linking dynamically would imply a new dependency. I'm not necessarily > against that, but would it be acceptable from your PoV? And if so, > should we keep the current internal BPF code for when libbpf is not > available, or would it be acceptable to not be able to load BPF programs > if libbpf is not present (similar to how the libelf dependency works > today)? iproute2 recently gained the libmnl dependency for extack. Seems like libbpf falls into the similar category. > >> 2. git submodules can be a PITA to deal with (e.g., jumping between >> branches and versions), so there needs to be a good reason for it. > > Yes, totally with you on that. Another option could be to just copy the > files into the iproute2 tree, and update them the same way the kernel > headers are? Or maybe doing fancy things like this: > https://github.com/apenwarr/git-subtrac kernel uapi is a totally different reason to import the headers. bpf functionality is an add-on. I would like to see iproute2 work with libbpf. Given libbpf's current status and availability across OS'es that is going to be a challenge for a lot of OS'es which is why I suggested the HAVE_LIBBPF check falls back to existing code if libbpf is not installed. > >> 3. iproute2 code needs to build for a wide range of OSes and not lose >> functionality compared to what it has today. > > Could you be a bit more specific about "a wide range of OSes"? I guess > we could do the work to make sure libbpf builds on all the same > platforms iproute2 supports, but we'd need something a bit more definite > to go on... > rhel5/centos5? definitely rhel6/centos6 time frame and forward. Stephen: has the backwards lifetime ever been stated? Changing configure to check for existence and fall back to existing code seems to me the safest option.
David Ahern <dsahern@gmail.com> writes: > On 2/4/20 3:35 PM, Toke Høiland-Jørgensen wrote: >> >>> Most likely, making iproute2 use libbpf statically is going to be >>> challenging and I am not sure it is the right thing to do (unless the >>> user is building a static version of iproute2 commands). >> >> Linking dynamically would imply a new dependency. I'm not necessarily >> against that, but would it be acceptable from your PoV? And if so, >> should we keep the current internal BPF code for when libbpf is not >> available, or would it be acceptable to not be able to load BPF programs >> if libbpf is not present (similar to how the libelf dependency works >> today)? > > iproute2 recently gained the libmnl dependency for extack. Seems like > libbpf falls into the similar category. > >> >>> 2. git submodules can be a PITA to deal with (e.g., jumping between >>> branches and versions), so there needs to be a good reason for it. >> >> Yes, totally with you on that. Another option could be to just copy the >> files into the iproute2 tree, and update them the same way the kernel >> headers are? Or maybe doing fancy things like this: >> https://github.com/apenwarr/git-subtrac > > kernel uapi is a totally different reason to import the headers. bpf > functionality is an add-on. > > I would like to see iproute2 work with libbpf. Given libbpf's current > status and availability across OS'es that is going to be a challenge for > a lot of OS'es which is why I suggested the HAVE_LIBBPF check falls back > to existing code if libbpf is not installed. Sure, can do. -Toke