
[RFC,bpf-next,v2,0/6] bpf: add BPF_MAP_DUMP command to

Message ID 20190627202417.33370-1-brianvv@google.com

Message

Brian Vazquez June 27, 2019, 8:24 p.m. UTC
This introduces a new command to retrieve a variable number of entries
from a bpf map.

This new command can be executed from the existing BPF syscall as
follows:

err = bpf(BPF_MAP_DUMP, union bpf_attr *attr, u32 size)
using attr->dump.map_fd, attr->dump.prev_key, attr->dump.buf,
attr->dump.buf_len
returns zero or a negative error, and populates buf and buf_len on
success

This implementation wraps the existing bpf methods map_get_next_key and
map_lookup_elem.

The results show that even with a buffer of one elem_size, it runs ~40%
faster than the current entry-by-entry implementation. Improvements of
~85% are reported when the buffer size is increased, although once the
buffer holds around 5% of the total number of entries, increasing it
further makes no significant difference.
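To make the wrapping concrete, here is a minimal user-space sketch of the control flow: a dump call that repeatedly pairs a get-next-key step with a lookup step until the caller's buffer is full or the map is exhausted. This is not the code from the series. The toy_* names, the array-backed map, and the choice to count buf_len in entries rather than bytes are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy map: a fixed array of key/value pairs (not a kernel type). */
struct toy_entry {
	uint32_t key;
	uint64_t value;
};

struct toy_map {
	const struct toy_entry *entries;
	size_t n;
};

/* Models map_get_next_key: a NULL prev yields the first key. */
int toy_get_next_key(const struct toy_map *m, const uint32_t *prev,
		     uint32_t *next)
{
	size_t i = 0;

	if (prev) {
		while (i < m->n && m->entries[i].key != *prev)
			i++;
		/* Unknown prev key: restart from the first entry. */
		i = (i == m->n) ? 0 : i + 1;
	}
	if (i >= m->n)
		return -ENOENT;
	*next = m->entries[i].key;
	return 0;
}

/* Models map_lookup_elem: copies the value stored under key. */
int toy_lookup_elem(const struct toy_map *m, const uint32_t *key,
		    uint64_t *value)
{
	for (size_t i = 0; i < m->n; i++) {
		if (m->entries[i].key == *key) {
			*value = m->entries[i].value;
			return 0;
		}
	}
	return -ENOENT;
}

/*
 * Models the dump command. On entry *buf_len is the buffer capacity in
 * entries (an assumption of this sketch); on return it is the number of
 * entries copied. Returns 0 if anything was copied, -ENOENT otherwise.
 */
int toy_map_dump(const struct toy_map *m, const uint32_t *prev_key,
		 struct toy_entry *buf, size_t *buf_len)
{
	uint32_t cur = prev_key ? *prev_key : 0;
	uint32_t next;
	int have_prev = prev_key != NULL;
	size_t copied = 0;

	assert(buf && buf_len);

	while (copied < *buf_len &&
	       toy_get_next_key(m, have_prev ? &cur : NULL, &next) == 0) {
		cur = next;
		have_prev = 1;
		/* Skip a key whose value can no longer be looked up. */
		if (toy_lookup_elem(m, &cur, &buf[copied].value) != 0)
			continue;
		buf[copied].key = cur;
		copied++;
	}
	*buf_len = copied;
	return copied ? 0 : -ENOENT;
}
```

A caller would start with prev_key set to NULL, then pass the last key returned in buf as prev_key on each following call until the function returns -ENOENT. In this model the saving comes from making one call per buffer instead of two calls per entry.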

Tested:
Tried different buffer sizes to cover the cases where the buffer is
larger than the map and where it holds fewer elements than the map
contains. All runs read a map of 100K entries. Below are the results
(in ns) from the different runs:

buf_len_1:       55528939 entry-by-entry: 97244981 improvement 42.897887%
buf_len_2:       34425779 entry-by-entry: 88863122 improvement 61.259769%
buf_len_230:     11700316 entry-by-entry: 88753301 improvement 86.817036%
buf_len_5000:    11615290 entry-by-entry: 88362637 improvement 86.854976%
buf_len_73000:   12083976 entry-by-entry: 89956483 improvement 86.566865%
buf_len_100000:  12638913 entry-by-entry: 89642303 improvement 85.900727%
buf_len_234567:  11873964 entry-by-entry: 89080077 improvement 86.670461%
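
The improvement column appears consistent with the relative reduction in run time, (entry_by_entry - dump) / entry_by_entry * 100. The helper below is a small sketch of that arithmetic. The function name improvement_pct is hypothetical and not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Relative run-time reduction in percent, assuming the improvement
 * column is computed against the entry-by-entry time.
 */
double improvement_pct(uint64_t dump_ns, uint64_t entry_by_entry_ns)
{
	assert(entry_by_entry_ns > 0);
	return 100.0 * ((double)entry_by_entry_ns - (double)dump_ns) /
	       (double)entry_by_entry_ns;
}
```

For the buf_len_1 row, improvement_pct(55528939, 97244981) gives roughly 42.898, which agrees with the reported figure.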

Changelog:

v2:
- use proper bpf-next tag

Brian Vazquez (6):
  bpf: add bpf_map_value_size and bpf_map_copy_value helper functions
  bpf: add BPF_MAP_DUMP command to access more than one entry per call
  bpf: keep bpf.h in sync with tools/
  libbpf: support BPF_MAP_DUMP command
  selftests/bpf: test BPF_MAP_DUMP command on a bpf hashmap
  selftests/bpf: add test to measure performance of BPF_MAP_DUMP

 include/uapi/linux/bpf.h                |   9 +
 kernel/bpf/syscall.c                    | 242 ++++++++++++++++++------
 tools/include/uapi/linux/bpf.h          |   9 +
 tools/lib/bpf/bpf.c                     |  28 +++
 tools/lib/bpf/bpf.h                     |   4 +
 tools/lib/bpf/libbpf.map                |   2 +
 tools/testing/selftests/bpf/test_maps.c | 141 +++++++++++++-
 7 files changed, 372 insertions(+), 63 deletions(-)

Comments

Alexei Starovoitov June 27, 2019, 10:14 p.m. UTC | #1
On Thu, Jun 27, 2019 at 01:24:11PM -0700, Brian Vazquez wrote:
> This introduces a new command to retrieve a variable number of entries
> from a bpf map.
> 
> This new command can be executed from the existing BPF syscall as
> follows:
> 
> err =  bpf(BPF_MAP_DUMP, union bpf_attr *attr, u32 size)
> using attr->dump.map_fd, attr->dump.prev_key, attr->dump.buf,
> attr->dump.buf_len
> returns zero or a negative error, and populates buf and buf_len on
> success
> 
> This implementation wraps the existing bpf methods map_get_next_key and
> map_lookup_elem.
>
> The results show that even with a buffer of one elem_size, it runs ~40%
> faster than the current entry-by-entry implementation. Improvements of
> ~85% are reported when the buffer size is increased, although once the
> buffer holds around 5% of the total number of entries, increasing it
> further makes no significant difference.

was it with kpti and retpoline mitigations?

Brian Vazquez June 28, 2019, 5:50 p.m. UTC | #2
> was it with kpti and retpoline mitigations?

No, it wasn't. Will get back with new numbers.