[bpf-next,v3,0/7] Implement queue/stack maps

Message ID 153986856416.9127.9618539079636149043.stgit@kernel

Message

Mauricio Vasquez Oct. 18, 2018, 1:16 p.m. UTC
Some applications need a pool of free elements, for example the list of
free L4 ports in a SNAT.  None of the current maps allows this, because it
is not possible to get an element without knowing the key it is associated
with.  Even if it were possible, the lack of locking mechanisms in eBPF
would make it almost impossible to implement without data races.

This patchset implements two new kinds of eBPF maps: queue and stack.
These maps provide the peek, push and pop operations to eBPF programs, and
a new bpf_map_lookup_and_delete_elem() is added for userspace applications.
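
To make the peek/push/pop semantics concrete, here is a minimal userspace
model of the SNAT use case: a LIFO pool of free L4 ports.  It is only a
sketch and not code from this series; struct port_pool, the pool_*()
functions and POOL_CAPACITY are hypothetical names chosen for illustration,
and the model does no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_CAPACITY 8	/* hypothetical size, for illustration only */

/* Userspace model of a LIFO pool of free L4 ports; not kernel code. */
struct port_pool {
	uint16_t ports[POOL_CAPACITY];
	size_t top;	/* number of ports currently stored */
};

void pool_init(struct port_pool *p)
{
	p->top = 0;
}

/* push: return a port to the pool; fails if the pool is full. */
bool pool_push(struct port_pool *p, uint16_t port)
{
	if (p->top == POOL_CAPACITY)
		return false;
	p->ports[p->top++] = port;
	return true;
}

/* peek: copy out the most recently pushed port without removing it. */
bool pool_peek(const struct port_pool *p, uint16_t *out)
{
	assert(out != NULL);
	if (p->top == 0)
		return false;
	*out = p->ports[p->top - 1];
	return true;
}

/* pop: copy out the most recently pushed port and remove it. */
bool pool_pop(struct port_pool *p, uint16_t *out)
{
	assert(out != NULL);
	if (p->top == 0)
		return false;
	*out = p->ports[--p->top];
	return true;
}
```

The point of the model is that no key is involved: a caller asks for "any
free port" with pool_pop() and gives it back with pool_push().  Elements are
copied out by value, so the caller never holds a pointer into the pool.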

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>

v2 -> v3:
 - Remove "almost dead code" in syscall.c
 - Remove unnecessary copy_from_user in bpf_map_lookup_and_delete_elem
 - Rebase

v1 -> v2:
 - Put ARG_PTR_TO_UNINIT_MAP_VALUE logic into a separate patch
 - Fix missing __this_cpu_dec & preempt_enable calls in kernel/bpf/syscall.c

RFC v4 -> v1:
 - Remove roundup to power of 2 in memory allocation
 - Remove count and use a free slot to check if queue/stack is empty
 - Use if + assignment for wrapping indexes
 - Fix some minor style issues
 - Squash two patches together
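
The three indexing changes listed above (no roundup to a power of 2, a free
slot instead of a count, and if + assignment for wrapping) can be sketched
with a small ring-buffer queue.  This is a userspace illustration under
stated assumptions, not the code in kernel/bpf/queue_stack_maps.c; struct
ring_queue, the ring_*() functions, MAX_ENTRIES and RING_SIZE are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 5			/* hypothetical; deliberately not a power of 2 */
#define RING_SIZE (MAX_ENTRIES + 1)	/* one slot always stays free */

struct ring_queue {
	uint32_t elems[RING_SIZE];
	uint32_t head;	/* next slot to write */
	uint32_t tail;	/* next slot to read */
};

void ring_init(struct ring_queue *q)
{
	q->head = 0;
	q->tail = 0;
}

/* Empty when both indexes meet; no separate element count is kept. */
bool ring_is_empty(const struct ring_queue *q)
{
	return q->head == q->tail;
}

/* Full when advancing head would make it collide with tail. */
bool ring_is_full(const struct ring_queue *q)
{
	uint32_t next = q->head + 1;

	if (next >= RING_SIZE)	/* wrap with if + assignment */
		next = 0;
	return next == q->tail;
}

bool ring_push(struct ring_queue *q, uint32_t value)
{
	if (ring_is_full(q))
		return false;
	q->elems[q->head] = value;
	if (++q->head >= RING_SIZE)
		q->head = 0;
	return true;
}

/* Pop the oldest element, copying it out by value. */
bool ring_pop(struct ring_queue *q, uint32_t *out)
{
	assert(out != NULL);
	if (ring_is_empty(q))
		return false;
	*out = q->elems[q->tail];
	if (++q->tail >= RING_SIZE)
		q->tail = 0;
	return true;
}
```

Because the array size is not a power of 2, the index cannot be wrapped with
a bit mask; the compare-and-reset form also avoids a modulo.  The cost of
dropping the count is one unused slot, which is why the sketch allocates
MAX_ENTRIES + 1 elements.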

RFC v3 -> RFC v4:
 - Revert renaming of kernel/bpf/stackmap.c
 - Remove restriction on value size
 - Remove len arguments from peek/pop helpers
 - Add new ARG_PTR_TO_UNINIT_MAP_VALUE

RFC v2 -> RFC v3:
 - Return elements by value instead of by reference
 - Implement queue/stack based on an array and head + tail indexes
 - Rename stack trace related files to avoid confusion and conflicts

RFC v1 -> RFC v2:
 - Create two separate maps instead of single one + flags
 - Implement bpf_map_lookup_and_delete syscall
 - Support peek operation
 - Define replacement policy through flags in the update() method
 - Add eBPF side tests

---

Mauricio Vasquez B (7):
      bpf: rename stack trace map operations
      bpf/syscall: allow key to be null in map functions
      bpf/verifier: add ARG_PTR_TO_UNINIT_MAP_VALUE
      bpf: add queue and stack maps
      bpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall
      Sync uapi/bpf.h to tools/include
      selftests/bpf: add test cases for queue and stack maps


 include/linux/bpf.h                                |    7 
 include/linux/bpf_types.h                          |    4 
 include/uapi/linux/bpf.h                           |   30 ++
 kernel/bpf/Makefile                                |    2 
 kernel/bpf/core.c                                  |    3 
 kernel/bpf/helpers.c                               |   43 +++
 kernel/bpf/queue_stack_maps.c                      |  288 ++++++++++++++++++++
 kernel/bpf/stackmap.c                              |    2 
 kernel/bpf/syscall.c                               |   91 ++++++
 kernel/bpf/verifier.c                              |   28 ++
 net/core/filter.c                                  |    6 
 tools/include/uapi/linux/bpf.h                     |   30 ++
 tools/lib/bpf/bpf.c                                |   12 +
 tools/lib/bpf/bpf.h                                |    2 
 tools/testing/selftests/bpf/Makefile               |    5 
 tools/testing/selftests/bpf/bpf_helpers.h          |    7 
 tools/testing/selftests/bpf/test_maps.c            |  122 ++++++++
 tools/testing/selftests/bpf/test_progs.c           |   99 +++++++
 tools/testing/selftests/bpf/test_queue_map.c       |    4 
 tools/testing/selftests/bpf/test_queue_stack_map.h |   59 ++++
 tools/testing/selftests/bpf/test_stack_map.c       |    4 
 21 files changed, 834 insertions(+), 14 deletions(-)
 create mode 100644 kernel/bpf/queue_stack_maps.c
 create mode 100644 tools/testing/selftests/bpf/test_queue_map.c
 create mode 100644 tools/testing/selftests/bpf/test_queue_stack_map.h
 create mode 100644 tools/testing/selftests/bpf/test_stack_map.c

--

Comments

Daniel Borkmann Oct. 19, 2018, 8:08 p.m. UTC | #1
On 10/18/2018 03:16 PM, Mauricio Vasquez B wrote:
> Some applications need a pool of free elements, for example the list of
> free L4 ports in a SNAT.  None of the current maps allows this, because it
> is not possible to get an element without knowing the key it is associated
> with.  Even if it were possible, the lack of locking mechanisms in eBPF
> would make it almost impossible to implement without data races.
> 
> This patchset implements two new kinds of eBPF maps: queue and stack.
> These maps provide the peek, push and pop operations to eBPF programs, and
> a new bpf_map_lookup_and_delete_elem() is added for userspace applications.
> 
> Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>

Series:

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Alexei Starovoitov Oct. 19, 2018, 8:30 p.m. UTC | #2
On Fri, Oct 19, 2018 at 10:08:08PM +0200, Daniel Borkmann wrote:
> On 10/18/2018 03:16 PM, Mauricio Vasquez B wrote:
> > Some applications need a pool of free elements, for example the list of
> > free L4 ports in a SNAT.  None of the current maps allows this, because it
> > is not possible to get an element without knowing the key it is associated
> > with.  Even if it were possible, the lack of locking mechanisms in eBPF
> > would make it almost impossible to implement without data races.
> > 
> > This patchset implements two new kinds of eBPF maps: queue and stack.
> > These maps provide the peek, push and pop operations to eBPF programs, and
> > a new bpf_map_lookup_and_delete_elem() is added for userspace applications.
> > 
> > Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Applied, Thanks