[PATCHv5,iproute2-next,0/5] iproute2: add libbpf support

Message ID 20201116065305.1010651-1-haliu@redhat.com

Message

Hangbin Liu Nov. 16, 2020, 6:53 a.m. UTC
This series converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available. This means that iproute2 will
correctly process BTF information and support the new-style BTF-defined
maps, while keeping compatibility with the old internal map definition
syntax.
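
To illustrate the difference (a minimal sketch following the iproute2
bpf_elf.h and libbpf bpf_helpers.h conventions; the map name is only an
example), here is the same array map in both styles:

    /* legacy iproute2 style: struct bpf_elf_map in the "maps" section,
     * assuming <linux/bpf.h> and iproute2's bpf_elf.h are included */
    struct bpf_elf_map __attribute__((section("maps"), used)) map_sh = {
            .type       = BPF_MAP_TYPE_ARRAY,
            .size_key   = sizeof(__u32),
            .size_value = sizeof(__u32),
            .max_elem   = 1,
    };

    /* new BTF-defined style: anonymous struct in the ".maps" section,
     * assuming libbpf's <bpf/bpf_helpers.h> is included */
    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY);
            __uint(max_entries, 1);
            __type(key, __u32);
            __type(value, __u32);
    } map_sh SEC(".maps");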

This is achieved by checking for libbpf at './configure' time, and using
it if available. By default the system libbpf will be used, but static
linking against a custom libbpf version can be achieved by passing
LIBBPF_DIR to configure. LIBBPF_FORCE can be set to 'on' to make configure
abort if no suitable libbpf is found (useful for automated packaging that
wants to enforce the dependency), or to 'off' to disable the libbpf check
and build iproute2 with the legacy bpf code.
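
For example (a sketch; /opt/libbpf is just a placeholder path, and the
variables behave as described above):

    # ./configure                          # use the system libbpf if available
    # LIBBPF_DIR=/opt/libbpf ./configure   # link statically against a custom libbpf
    # LIBBPF_FORCE=on ./configure          # abort if no suitable libbpf is found
    # LIBBPF_FORCE=off ./configure         # skip the libbpf check, use legacy bpf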

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code ensures that iproute2 will
still understand the old map definition format, including populating
map-in-map and tail call maps before load.
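
As a sketch of what the wrapper keeps working (names are illustrative and
mirror the tc test below, which loads sections like "42/0"): a legacy
prog_array whose slots iproute2 fills from "<id>/<index>" program sections
before attaching, assuming <linux/bpf.h> and iproute2's bpf_elf.h:

    struct bpf_elf_map __attribute__((section("maps"), used)) jmp_tc = {
            .type       = BPF_MAP_TYPE_PROG_ARRAY,
            .id         = 42,   /* matched against the section names below */
            .size_key   = sizeof(__u32),
            .size_value = sizeof(__u32),
            .max_elem   = 2,
    };

    /* the loader populates jmp_tc[0] with this program before attach */
    __attribute__((section("42/0"), used))
    int cls_case1(struct __sk_buff *skb)
    {
            return 0;
    }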

The examples in examples/bpf are kept, and a separate set of examples is
added with BTF-based map definitions for those examples where this is
possible (libbpf doesn't currently support declaratively populating tail
call maps).
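
For instance, the BTF-based map-in-map variant can be populated
declaratively with libbpf's __array() convention (a sketch; the names
match the test output below):

    struct inner_map {
            __uint(type, BPF_MAP_TYPE_ARRAY);
            __uint(max_entries, 1);
            __type(key, __u32);
            __type(value, __u32);
    } map_inner SEC(".maps");

    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
            __uint(max_entries, 1);
            __type(key, __u32);
            __array(values, struct inner_map);
    } map_outer SEC(".maps") = {
            .values = { [0] = &map_inner },
    };

No such declarative initializer exists for PROG_ARRAY (tail call) maps,
which is why those examples stay in the legacy folder.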

Finally, thanks a lot to Toke for his help on this patch set.

v5:
a) Fix LIBBPF_DIR typo and description; use libbpf DESTDIR as the
   LIBBPF_DIR destination.
b) Fix bpf_prog_load_dev typo.
c) Rebase to latest iproute2-next.

v4:
a) Add the LIBBPF_FORCE variable to control whether to build iproute2
   with libbpf.
b) Add new file bpf_glue.c for mixed libbpf/legacy bpf calls.
c) Fix some build issues and a shell compatibility error.

v3:
a) Update configure to check for the function bpf_program__section_name()
   separately.
b) Add a new function get_bpf_program__section_name() to choose whether
   to use bpf_program__title() or not.
c) Test-build the patch on Fedora 33 with libbpf-0.1.0-1.fc33 and
   libbpf-devel-0.1.0-1.fc33.

v2:
a) Remove the self-defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
b) Add ipvrf with libbpf support.


Here are the test results with patched iproute2:

== setup env
# clang -O2 -Wall -g -target bpf -c bpf_graft.c -o btf_graft.o
# clang -O2 -Wall -g -target bpf -c bpf_map_in_map.c -o btf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c bpf_shared.c -o btf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_cyclic.c -o bpf_cyclic.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_graft.c -o bpf_graft.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_map_in_map.c -o bpf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_shared.c -o bpf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_tailcall.c -o bpf_tailcall.o
# rm -rf /sys/fs/bpf/xdp/globals
# /root/iproute2/ip/ip link add type veth
# /root/iproute2/ip/ip link set veth0 up
# /root/iproute2/ip/ip link set veth1 up


== Load objs
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 4 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
4: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:21-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 5
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 8 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
8: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:23-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 10
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 12 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
12: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:25-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 15
# /root/iproute2/ip/ip link set veth0 xdp off


== Load objs again to make sure maps can be reused
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 16 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
16: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:27-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 20
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 20 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
20: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:29-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 25
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 24 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
24: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:31-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 30
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals

== Testing if we can load new-style objects (using xdp-filter as an example)
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_all.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 28 tag e29eeda1489a6520 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
28: xdp  name xdpfilt_alw_all  tag e29eeda1489a6520  gpl
        loaded_at 2020-10-22T08:04:33-0400  uid 0
        xlated 2408B  jited 1405B  memlock 4096B  map_ids 9,5,7,8,6
        btf_id 35
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_ip.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 32 tag 2f2b9dbfb786a5a2 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
32: xdp  name xdpfilt_alw_ip  tag 2f2b9dbfb786a5a2  gpl
        loaded_at 2020-10-22T08:04:35-0400  uid 0
        xlated 1336B  jited 778B  memlock 4096B  map_ids 7,8,5
        btf_id 40
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_tcp.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 36 tag 18c1bb25084030bc jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
36: xdp  name xdpfilt_alw_tcp  tag 18c1bb25084030bc  gpl
        loaded_at 2020-10-22T08:04:37-0400  uid 0
        xlated 1128B  jited 690B  memlock 4096B  map_ids 6,5
        btf_id 45
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Load new BTF-defined maps
# /root/iproute2/ip/ip link set veth0 xdp obj btf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 40 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
40: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:39-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 50
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 44 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
44: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:41-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 13
        btf_id 55
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 48 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer  map_sh
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
14: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
48: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:43-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 14
        btf_id 60
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Test load objs by tc
# /root/iproute2/tc/tc qdisc add dev veth0 ingress
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_cyclic.o sec 0xabccba/0
# /root/iproute2/tc/tc filter add dev veth0 parent ffff: bpf obj bpf_graft.o
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/1
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 43/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec classifier
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
# ls /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d:
jmp_tc

/sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f:
jmp_ex  jmp_tc  map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc
# bpftool map show
15: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
16: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
17: prog_array  name jmp_ex  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
18: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 2  memlock 4096B
        owner_prog_type sched_cls  owner jited
19: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
52: sched_cls  name cls_loop  tag 3e98a40b04099d36  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 168B  jited 133B  memlock 4096B  map_ids 15
        btf_id 65
56: sched_cls  name cls_entry  tag 0fbb4d9310a6ee26  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 144B  jited 121B  memlock 4096B  map_ids 16
        btf_id 70
60: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 75
66: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 80
72: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 85
78: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 90
79: sched_cls  name cls_case2  tag ee218ff893dca823  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 336B  jited 218B  memlock 4096B  map_ids 19,18
        btf_id 90
80: sched_cls  name cls_exit  tag e78a58140deed387  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 288B  jited 177B  memlock 4096B  map_ids 19
        btf_id 90

I also ran the following upstream kselftests with the patched iproute2
and all of them passed.

test_lwt_ip_encap.sh
test_xdp_redirect.sh
test_tc_redirect.sh
test_xdp_meta.sh
test_xdp_veth.sh
test_xdp_vlan.sh

Hangbin Liu (5):
  configure: add check_libbpf() for later libbpf support
  lib: rename bpf.c to bpf_legacy.c
  lib: add libbpf support
  examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  examples/bpf: add bpf examples with BTF defined maps

 configure                                | 108 +++++++
 examples/bpf/README                      |  18 +-
 examples/bpf/bpf_graft.c                 |  14 +-
 examples/bpf/bpf_map_in_map.c            |  37 ++-
 examples/bpf/bpf_shared.c                |  14 +-
 examples/bpf/{ => legacy}/bpf_cyclic.c   |   2 +-
 examples/bpf/legacy/bpf_graft.c          |  66 +++++
 examples/bpf/legacy/bpf_map_in_map.c     |  56 ++++
 examples/bpf/legacy/bpf_shared.c         |  53 ++++
 examples/bpf/{ => legacy}/bpf_tailcall.c |   2 +-
 include/bpf_api.h                        |  13 +
 include/bpf_util.h                       |  21 +-
 ip/ipvrf.c                               |   6 +-
 lib/Makefile                             |   6 +-
 lib/bpf_glue.c                           |  35 +++
 lib/{bpf.c => bpf_legacy.c}              | 193 ++++++++++++-
 lib/bpf_libbpf.c                         | 353 +++++++++++++++++++++++
 17 files changed, 939 insertions(+), 58 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 create mode 100644 examples/bpf/legacy/bpf_graft.c
 create mode 100644 examples/bpf/legacy/bpf_map_in_map.c
 create mode 100644 examples/bpf/legacy/bpf_shared.c
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)
 create mode 100644 lib/bpf_glue.c
 rename lib/{bpf.c => bpf_legacy.c} (94%)
 create mode 100644 lib/bpf_libbpf.c

Comments

Alexei Starovoitov Nov. 16, 2020, 7:19 a.m. UTC | #1
On Sun, Nov 15, 2020 at 10:56 PM Hangbin Liu <haliu@redhat.com> wrote:
>
> This series converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available. This means that iproute2 will
> correctly process BTF information and support the new-style BTF-defined
> maps, while keeping compatibility with the old internal map definition
> syntax.
>
> [...]

For the reasons explained multiple times earlier:
Nacked-by: Alexei Starovoitov <ast@kernel.org>
Jesper Dangaard Brouer Nov. 16, 2020, 2:54 p.m. UTC | #2
On Sun, 15 Nov 2020 23:19:26 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Sun, Nov 15, 2020 at 10:56 PM Hangbin Liu <haliu@redhat.com> wrote:
> >
> > [...]
> 
> For the reasons explained multiple times earlier:
> Nacked-by: Alexei Starovoitov <ast@kernel.org>

We really need to get another BPF-ELF loader into iproute2.  I have
done a number of practical projects with TC-BPF, and it sucks that
iproute2 has this outdated (compiled-in) BPF loader.  Examples include
jumping through hoops to get XDP + TC to collaborate[1] and dealing
with the iproute2 map-elf layout[2].

Thus, IMHO we MUST move forward and get started on converting
iproute2 to libbpf, and start the work to deprecate the built-in
BPF-ELF loader.  I would prefer ripping out the BPF-ELF loader and
replacing it with libbpf code that handles the older binary elf-map
layout, but I do understand if you want to keep this around (at least
for the next couple of releases).

Maybe we can get a little closer to what Alexei wants?

When compiled against a dynamic libbpf, I would use the 'ldd' command
to see which libbpf version is used.  When compiled/linked statically
against a custom libbpf version (already supported via LIBBPF_DIR),
*I* think it is difficult to figure out which version of libbpf I'm
using.  Could we add the libbpf version info to 'tc -V'?  That would
remove one of my concerns with static linking.

I actually fear that it will be a bad user experience when we start to
have multiple userspace tools that load BPF, but each is compiled and
statically linked with its own version of libbpf (with git submodules,
an increasing number of tools will have more variations!).  Small
variations in supported features can cause strange and difficult
troubleshooting.  A practical example is xdp-cpumap-tc[1], where I had
to instruct the customer to load the XDP program *BEFORE* the TC
program so that the map (shared between TC and XDP) would be created
correctly, and the userspace tool written with libbpf would have proper
map access and info.
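
(For reference, that sharing relies on iproute2's legacy global pinning;
a sketch with an illustrative map name, PIN_GLOBAL_NS coming from
iproute2's bpf_elf.h:)

    /* pinned in the global namespace, e.g. /sys/fs/bpf/xdp/globals as
     * seen in the test log above, so the XDP and TC programs and the
     * userspace tool all resolve to the same map instance */
    struct bpf_elf_map __attribute__((section("maps"), used)) shared_map = {
            .type       = BPF_MAP_TYPE_ARRAY,
            .size_key   = sizeof(__u32),
            .size_value = sizeof(__u32),
            .max_elem   = 1,
            .pinning    = PIN_GLOBAL_NS,
    };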


I actually think it makes sense to have iproute2 require a specific
libbpf version, and also to move this version requirement forward as
the kernel evolves features that get added to libbpf.  I know this is
kind of controversial, and an attempt to pressure distro vendors into
updating libbpf.  Maybe it will actually backfire, as the person
generating the DEB/RPM software package will/can choose to compile
iproute2 without ELF-BPF/libbpf support.


[1] https://github.com/xdp-project/xdp-cpumap-tc
[2] https://github.com/netoptimizer/bpf-examples/blob/71db45b28ec/traffic-pacing-edt/edt_pacer02.c#L33-L35
Stephen Hemminger Nov. 16, 2020, 4:45 p.m. UTC | #3
On Sun, 15 Nov 2020 23:19:26 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Sun, Nov 15, 2020 at 10:56 PM Hangbin Liu <haliu@redhat.com> wrote:
> >
> > [...]
> 
> For the reasons explained multiple times earlier:
> Nacked-by: Alexei Starovoitov <ast@kernel.org>

Could you propose a trial balloon patch to show what you would like to see in iproute2?
Toke Høiland-Jørgensen Nov. 16, 2020, 11:29 p.m. UTC | #4
Jesper Dangaard Brouer <brouer@redhat.com> writes:

> When compiled against a dynamic libbpf, I would use the 'ldd' command
> to see which libbpf version is used.  When compiled/linked statically
> against a custom libbpf version (already supported via LIBBPF_DIR),
> *I* think it is difficult to figure out which version of libbpf I'm
> using.  Could we add the libbpf version info to 'tc -V'?  That would
> remove one of my concerns with static linking.

Agreed, I think we should definitely add the libbpf version to the tool
version output.

-Toke
Alexei Starovoitov Nov. 17, 2020, 2:37 a.m. UTC | #5
On Mon, Nov 16, 2020 at 03:54:46PM +0100, Jesper Dangaard Brouer wrote:
> 
> Thus, IMHO we MUST move forward and get started on converting
> iproute2 to libbpf, and start the work to deprecate the built-in
> BPF-ELF loader.  I would prefer ripping out the BPF-ELF loader and
> replacing it with libbpf code that handles the older binary elf-map
> layout, but I do understand if you want to keep this around (at least
> for the next couple of releases).

I don't understand why the legacy code has to stay around.
Having the legacy code and an option to build tc without libbpf creates
a backward compatibility risk for tc users:
a newer tc may not load bpf progs that an older tc did.

> I actually fear that it will be a bad user experience when we start to
> have multiple userspace tools that load BPF, but each is compiled and
> statically linked with its own version of libbpf (with git submodules,
> an increasing number of tools will have more variations!).

So far people either freeze the bpftool that they use to load progs
or they use libbpf directly in their applications.
Any other way means that the application behavior will be unpredictable.
If a company built a bpf-based product and wants to distribute such a
product as a package, it needs a way to specify this dependency in the
pkg config.  'tc -V' is not something that can be put in a spec.
The main iproute2 version can be used as a dependency, but it's
meaningless when the presence of libbpf and its version are not
strictly derived from the iproute2 spec.
The users should be able to write in their spec:
BuildRequires: iproute-tc >= 5.10
and be confident that tc will load the prog they've developed and tested.

> I actually think it makes sense to have iproute2 require a specific
> libbpf version, and also to move this version requirement forward as
> the kernel evolves features that get added to libbpf.

+1
Hangbin Liu Nov. 17, 2020, 3:19 a.m. UTC | #6
On Mon, Nov 16, 2020 at 06:37:57PM -0800, Alexei Starovoitov wrote:
> On Mon, Nov 16, 2020 at 03:54:46PM +0100, Jesper Dangaard Brouer wrote:
> > 
> > Thus, IMHO we MUST move forward and get started on converting
> > iproute2 to libbpf, and start the work to deprecate the built-in
> > BPF-ELF loader.  I would prefer ripping out the BPF-ELF loader and
> > replacing it with libbpf code that handles the older binary elf-map
> > layout, but I do understand if you want to keep this around (at least
> > for the next couple of releases).
> 
> I don't understand why the legacy code has to stay around.
> Having the legacy code and an option to build tc without libbpf creates
> a backward compatibility risk for tc users:
> a newer tc may not load bpf progs that an older tc did.

If a distro chooses to compile iproute2 with libbpf, I don't think they
will compile iproute2 without libbpf in a newer version.  So a yum/apt-get
update from an official source doesn't look like a problem.

Unless a user chooses to use a self-built iproute2 version.  Then the
self-built version may also lack other support, like libelf, libmnl,
libcap, etc.

> 
> > I actually fear that it will be a bad user experience when we start to
> > have multiple userspace tools that load BPF, but each is compiled and
> > statically linked with its own version of libbpf (with git submodules,
> > an increasing number of tools will have more variations!).
> 
> So far people either freeze the bpftool that they use to load progs
> or they use libbpf directly in their applications.
> Any other way means that the application behavior will be unpredictable.
> If a company built a bpf-based product and wants to distribute such a
> product as a package, it needs a way to specify this dependency in the
> pkg config.  'tc -V' is not something that can be put in a spec.
> The main iproute2 version can be used as a dependency, but it's
> meaningless when the presence of libbpf and its version are not
> strictly derived from the iproute2 spec.
> The users should be able to write in their spec:
> BuildRequires: iproute-tc >= 5.10
> and be confident that tc will load the prog they've developed and tested.

The current patch does have a libbpf version check; it needs at least
libbpf 0.1.0.  So if a distro starts to build iproute2 against libbpf,
there will be a dependency.  The rule could be added to the rpm spec
file, or whatever else the distro chooses.  That's the distro
packager's job.

Unless you mean that a company built a bpf-based product and only added
an iproute2 version dependency (let's say some distro has iproute2 5.12
with libbpf support), and somehow forgot to add a libbpf version
dependency check and a distro check.  At the same time a user runs the
product on a distro whose iproute2 5.12 was compiled without libbpf.
That indeed would cause a problem.

But if I were the user, I would think the company is not professional
with bpf products if they don't even know libbpf is needed...

So my opinion: for end users, the distro should take care of libbpf and
iproute2 version control.  A bpf company should take care of whether
libbpf is used by iproute2 and which distros they support.

Please correct me if I missed something.

Thanks
Hangbin
David Ahern Nov. 17, 2020, 3:38 a.m. UTC | #7
On 11/16/20 7:54 AM, Jesper Dangaard Brouer wrote:
> When compiled against a dynamic libbpf, I would use the 'ldd' command
> to see which libbpf version is used.  When compiled/linked statically
> against a custom libbpf version (already supported via LIBBPF_DIR),
> *I* think it is difficult to figure out which version of libbpf I'm
> using.  Could we add the libbpf version info to 'tc -V'?  That would
> remove one of my concerns with static linking.

Adding libbpf version to 'tc -V' and 'ip -V' seems reasonable.

As for the bigger problem, trying to force user space components to
constantly chase latest and greatest S/W versions is not the right answer.

The crux of the problem here is loading bpf object files and what will
most likely be a never ending stream of enhancements that impact the
proper loading of them. bpftool is much more suited to the job of
managing bpf files versus iproute2 which is the de facto implementation
for networking APIs. bpftool ships as part of a common linux tools
package, so it will naturally track kernel versions for those who want /
need latest and greatest versions. Users who are not building their own
agents for managing bpf files (which I think is much more appropriate
for production use cases than forking command line utilities) can use
bpftool to load files, manage maps which are then attached to the
programs, etc, and then invoke iproute2 to handle the networking attach
/ detach / list with detailed information.

That said, the legacy bpf code in iproute2 has created some
expectations, and iproute2 can not simply remove existing capabilities.
Moving iproute2 to libbpf provides an improvement over the current
status by allowing ‘modern’ bpf object files to be loaded without
affecting legacy users, even if it does not allow latest and greatest
bpf capabilities at every moment in time (again, a constantly moving
reference point).

iproute2 is a networking configuration tool, not a bpf management tool.
Hangbin’s approach gives full flexibility to those who roll their own
and for distributions who value stability, it allows iproute2 to use
latest and greatest libbpf for those who want to chase the pot of gold
at the end of the rainbow, or they can choose stability with an OS
distro’s libbpf or legacy bpf. I believe this is the right compromise at
this point in time.
Edward Cree Nov. 17, 2020, 11:56 a.m. UTC | #8
On 17/11/2020 02:37, Alexei Starovoitov wrote:
> If a company built a bpf-based product and wants to distribute such a
> product as a package, it needs a way to specify this dependency in the
> pkg config.  'tc -V' is not something that can be put in a spec.
> The main iproute2 version can be used as a dependency, but it's
> meaningless when the presence of libbpf and its version are not
> strictly derived from the iproute2 spec.

But if libbpf is dynamically linked, they can put
Requires: libbpf >= 0.3.0
Requires: iproute-tc >= 5.10
and get the dependency behaviour they need.  No?

-ed
Alexei Starovoitov Nov. 17, 2020, 6:19 p.m. UTC | #9
On Mon, Nov 16, 2020 at 08:38:15PM -0700, David Ahern wrote:
> 
> As for the bigger problem, trying to force user space components to
> constantly chase latest and greatest S/W versions is not the right answer.

Your own nexthop enhancements in the kernel code follow 1-1 with iproute2
changes.  So the users do chase the latest kernel and the latest iproute2
if they want the networking feature.
Yet you're arguing that for bpf features they shouldn't have such
expectations of iproute2, which will not support the latest kernel bpf
features.  I sense a lot of bias here.

> The crux of the problem here is loading bpf object files and what will
> most likely be a never ending stream of enhancements that impact the
> proper loading of them.

Please stop spreading this misinformation.
Multiple people have explained numerous times that libbpf takes care of
backward compatibility.

> That said, the legacy bpf code in iproute2 has created some
> expectations, and iproute2 can not simply remove existing capabilities.

It certainly can remove them by moving to libbpf.

> iproute2 is a networking configuration tool, not a bpf management tool.
> Hangbin’s approach gives full flexibility to those who roll their own
> and for distributions who value stability, it allows iproute2 to use
> latest and greatest libbpf for those who want to chase the pot of gold
> at the end of the rainbow, or they can choose stability with an OS
> distro’s libbpf or legacy bpf. I believe this is the right compromise at
> this point in time.

In other words you're saying that upstream iproute2 is a kitchen sink
of untested combinations of libraries, and distros are supposed to do a
ton of extra work to provide their users a quality iproute2.
Alexei Starovoitov Nov. 17, 2020, 6:27 p.m. UTC | #10
On Tue, Nov 17, 2020 at 11:19:33AM +0800, Hangbin Liu wrote:
> On Mon, Nov 16, 2020 at 06:37:57PM -0800, Alexei Starovoitov wrote:
> [...]
> 
> The current patch does have a libbpf version check; it needs at least
> libbpf 0.1.0.  So if a distro starts to build iproute2 against libbpf,
> there will be a dependency.  The rule could be added to the rpm spec
> file, or whatever else the distro chooses.  That's the distro
> packager's job.
> 
> Unless you mean that a company built a bpf-based product and only added
> an iproute2 version dependency (let's say some distro has iproute2 5.12
> with libbpf support), and somehow forgot to add a libbpf version
> dependency check and a distro check.  At the same time a user runs the
> product on a distro whose iproute2 5.12 was compiled without libbpf.
> That indeed would cause a problem.

right.
You've answered Ed's question:

> But if libbpf is dynamically linked, they can put
> Requires: libbpf >= 0.3.0
> Requires: iproute-tc >= 5.10
> and get the dependency behaviour they need.  No?

It is a problem because >= 5.10 cannot capture legacy vs libbpf.

> But if I were the user, I would think the company is not professional
> with bpf products if they don't even know libbpf is needed...
>
> So my opinion: for end users, the distro should take care of libbpf and
> iproute2 version control.  A bpf company should take care of whether
> libbpf is used by iproute2 and which distros they support.

So you're saying that the bpf community shouldn't care about their users,
and the distros are supposed to step forward and provide proper bpf
support in tools like iproute2?
In other words, iproute2 upstream doesn't care about shipping a quality
product; it's the distros' job now.
Thanks, but no.
iproute2 should stay with the legacy obsolete prog loader
and the users should switch to the bpftool + iproute2 combination:
bpftool for loading progs and iproute2 for networking configs.