[bpf,v4,0/2] bpf: change uapi for bpf iterator map elements

Message ID 20200805055056.1457388-1-yhs@fb.com

Message

Yonghong Song Aug. 5, 2020, 5:50 a.m. UTC
Andrii raised a concern that the current uapi for the bpf iterator map
element is a little restrictive and not suitable for potential future
complex customization. This is a valid concern, considering people
may indeed add more complex customization to the iterator, e.g.,
cgroup_id + user_id, etc. for task or task_file. Another example might
be map_id plus additional control so that the bpf iterator may bail
out of a bucket early if the bucket has too many elements, since walking
it may hold the lock too long and impact other parts of the system.

Patch #1 modified uapi with kernel changes. Patch #2
adjusted libbpf api accordingly.

Changelogs:
  v3 -> v4:
    . add a forward declaration of bpf_iter_link_info in
      tools/lib/bpf/bpf.h in case libbpf is built against
      an older uapi bpf.h.
    . target the patch set to "bpf" instead of "bpf-next"
  v2 -> v3:
    . undo "not reject iter_info.map.map_fd == 0" from v2.
      In the future map_fd may become optional, so let us use map_fd == 0
      to indicate that the map_fd is not set by user space.
    . add link_info_len to bpf_iter_attach_opts so that the user always
      supplies the correct link_info_len. Otherwise, libbpf may deduce an
      incorrect link_info_len if it uses a different uapi header than the
      user app.
  v1 -> v2:
    . ensure link_create target_fd/flags == 0 since they are not used. (Andrii)
    . if exactly one of iter_info ptr == 0 and iter_info_len == 0 holds,
      return an error to user space. (Andrii)
    . do not reject iter_info.map.map_fd == 0; go ahead and try to get a
      map reference with it, since the map_fd is required for the map_elem
      iterator.
    . use bpf_iter_link_info in bpf_iter_attach_opts instead of map_fd.
      This way, user space is responsible for setting up bpf_iter_link_info
      and libbpf just passes the data to the kernel, simplifying the libbpf
      design. (Andrii)
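
For readers following the changelog, the argument checks it describes can
be modelled roughly as below. This is only a user-space sketch, not the
kernel code: the type and function names (iter_link_info_model,
link_create_model, check_iter_attach, check_map_elem_target) are
hypothetical, the error codes are illustrative, and rejecting an
over-long buffer is a simplification. Only map.map_fd, target_fd, flags,
iter_info and iter_info_len are taken from the text above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the uapi link info; only map.map_fd is
 * mentioned in the changelog. */
union iter_link_info_model {
	struct {
		uint32_t map_fd;
	} map;
};

/* Hypothetical stand-in for the link_create attributes. */
struct link_create_model {
	uint32_t target_fd;     /* not used by iterators, must be 0 */
	uint32_t flags;         /* not used by iterators, must be 0 */
	const void *iter_info;  /* caller's link info, or NULL */
	uint32_t iter_info_len; /* size of *iter_info as the caller sees it */
};

/*
 * Generic checks. Returns 0 or a negative errno (illustrative values).
 * On success *out holds the link info, zero-filled if none was given.
 */
static int check_iter_attach(const struct link_create_model *attr,
			     union iter_link_info_model *out)
{
	memset(out, 0, sizeof(*out));

	/* v1 -> v2: unused fields must be zero. */
	if (attr->target_fd || attr->flags)
		return -EINVAL;

	/* v1 -> v2: pointer and length must be both set or both unset. */
	if (!attr->iter_info != !attr->iter_info_len)
		return -EINVAL;

	if (attr->iter_info) {
		/* v2 -> v3: trust the caller-supplied length, not a guess.
		 * Simplification: refuse anything larger than we know. */
		if (attr->iter_info_len > sizeof(*out))
			return -E2BIG;
		memcpy(out, attr->iter_info, attr->iter_info_len);
	}
	return 0;
}

/* v2 -> v3: for the map_elem target, map_fd == 0 means "not set". */
static int check_map_elem_target(const union iter_link_info_model *linfo)
{
	return linfo->map.map_fd ? 0 : -EBADF;
}
```

With this model, passing a pointer without a length (or the reverse)
fails in the generic check, while passing neither succeeds there and is
rejected only by the map_elem target, which still requires a map_fd.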

Yonghong Song (2):
  bpf: change uapi for bpf iterator map elements
  tools/bpf: support new uapi for map element bpf iterator

 include/linux/bpf.h                           | 10 ++--
 include/uapi/linux/bpf.h                      | 15 ++---
 kernel/bpf/bpf_iter.c                         | 58 +++++++++----------
 kernel/bpf/map_iter.c                         | 37 +++++++++---
 kernel/bpf/syscall.c                          |  2 +-
 net/core/bpf_sk_storage.c                     | 37 +++++++++---
 tools/bpf/bpftool/iter.c                      |  9 ++-
 tools/include/uapi/linux/bpf.h                | 15 ++---
 tools/lib/bpf/bpf.c                           |  3 +
 tools/lib/bpf/bpf.h                           |  5 +-
 tools/lib/bpf/libbpf.c                        |  6 +-
 tools/lib/bpf/libbpf.h                        |  5 +-
 .../selftests/bpf/prog_tests/bpf_iter.c       | 40 ++++++++++---
 13 files changed, 160 insertions(+), 82 deletions(-)

Comments

Alexei Starovoitov Aug. 6, 2020, 11:46 p.m. UTC | #1
On Tue, Aug 4, 2020 at 10:51 PM Yonghong Song <yhs@fb.com> wrote:
>
> Andrii raised a concern that the current uapi for the bpf iterator map
> element is a little restrictive and not suitable for potential future
> complex customization. This is a valid concern, considering people
> may indeed add more complex customization to the iterator, e.g.,
> cgroup_id + user_id, etc. for task or task_file. Another example might
> be map_id plus additional control so that the bpf iterator may bail
> out of a bucket early if the bucket has too many elements, since walking
> it may hold the lock too long and impact other parts of the system.
>
> Patch #1 modified uapi with kernel changes. Patch #2
> adjusted libbpf api accordingly.
>
> Changelogs:
>   v3 -> v4:
>     . add a forward declaration of bpf_iter_link_info in
>       tools/lib/bpf/bpf.h in case libbpf is built against
>       an older uapi bpf.h.
>     . target the patch set to "bpf" instead of "bpf-next"

Applied. Thanks