[bpf-next,v3,0/2] bpf: change uapi for bpf iterator map elements

Message ID 20200803224340.2925417-1-yhs@fb.com

Message

Yonghong Song Aug. 3, 2020, 10:43 p.m. UTC
Andrii raised a concern that the current uapi for the bpf iterator map
element is a little restrictive and not suitable for potential future
complex customization. This is a valid concern, considering people
may indeed add more complex customization to the iterator, e.g.,
cgroup_id + user_id, etc. for task or task_file. Another example might
be map_id plus additional control so that the bpf iterator may bail
out of a bucket early if the bucket has too many elements, which could
otherwise hold a lock too long and impact other parts of the system.

Patch #1 modifies the uapi along with the corresponding kernel changes.
Patch #2 adjusts the libbpf api accordingly.

Changelogs:
  v2 -> v3:
    . undo the v1 -> v2 change "do not reject iter_info.map.map_fd == 0".
      In the future map_fd may become optional, so let us use map_fd == 0
      to indicate that the map_fd is not set by user space.
    . add link_info_len to bpf_iter_attach_opts so that the user always
      supplies the correct link_info_len. Otherwise, libbpf may deduce an
      incorrect link_info_len if it uses a different uapi header than the
      user app.
  v1 -> v2:
    . ensure link_create target_fd/flags == 0 since they are not used. (Andrii)
    . if exactly one of iter_info ptr == 0 and iter_info_len == 0 holds,
      return an error to user space. (Andrii)
    . do not reject iter_info.map.map_fd == 0; go ahead and use it to try to
      get a map reference, since the map_fd is required for the map_elem
      iterator.
    . use bpf_iter_link_info in bpf_iter_attach_opts instead of map_fd.
      This way, user space is responsible for setting up bpf_iter_link_info
      and libbpf just passes the data to the kernel, simplifying the libbpf
      design. (Andrii)
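
The validation rules listed in the changelog can be sketched in plain C
as follows. This is a minimal user-space model and not the kernel code:
the type names iter_link_info_model and iter_attach_opts_model, the
helper check_map_elem_link_info, and the specific errno values are all
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the uapi link info union (hypothetical name). */
union iter_link_info_model {
	struct {
		uint32_t map_fd;
	} map;
};

/* Illustrative stand-in for the attach options filled in by user space. */
struct iter_attach_opts_model {
	const union iter_link_info_model *link_info;
	uint32_t link_info_len;
};

/*
 * Model of the checks described in the changelog:
 *  - the link info pointer and length must be both set or both unset;
 *  - a map element iterator needs link info, and map_fd == 0 is treated
 *    as "not set by user space" and rejected.
 * Returns 0 on success or a negative errno value (values are assumed).
 */
static int check_map_elem_link_info(const struct iter_attach_opts_model *opts)
{
	int has_ptr, has_len;

	assert(opts != NULL);

	has_ptr = opts->link_info != NULL;
	has_len = opts->link_info_len != 0;

	/* Exactly one of ptr/len being zero is an error. */
	if (has_ptr != has_len)
		return -EINVAL;

	/* The map_elem iterator cannot work without link info. */
	if (!has_ptr)
		return -EINVAL;

	/* map_fd == 0 means user space did not set a map. */
	if (opts->link_info->map.map_fd == 0)
		return -EBADF;

	return 0;
}
```

In this model the caller passes the length explicitly rather than
letting a library compute sizeof() on its own copy of the union, which
is the reason given above for adding link_info_len to
bpf_iter_attach_opts.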

Yonghong Song (2):
  bpf: change uapi for bpf iterator map elements
  tools/bpf: support new uapi for map element bpf iterator

 include/linux/bpf.h                           | 10 ++--
 include/uapi/linux/bpf.h                      | 15 ++---
 kernel/bpf/bpf_iter.c                         | 58 +++++++++----------
 kernel/bpf/map_iter.c                         | 37 +++++++++---
 kernel/bpf/syscall.c                          |  2 +-
 net/core/bpf_sk_storage.c                     | 37 +++++++++---
 tools/bpf/bpftool/iter.c                      |  9 ++-
 tools/include/uapi/linux/bpf.h                | 15 ++---
 tools/lib/bpf/bpf.c                           |  3 +
 tools/lib/bpf/bpf.h                           |  4 +-
 tools/lib/bpf/libbpf.c                        |  6 +-
 tools/lib/bpf/libbpf.h                        |  5 +-
 .../selftests/bpf/prog_tests/bpf_iter.c       | 40 ++++++++++---
 13 files changed, 159 insertions(+), 82 deletions(-)

Comments

Andrii Nakryiko Aug. 3, 2020, 11:10 p.m. UTC | #1
Looks great, thanks!

Acked-by: Andrii Nakryiko <andriin@fb.com>