From patchwork Wed Jan 15 18:43:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1223799 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=ebYO85PY; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47ybp63S5wz9sR1 for ; Thu, 16 Jan 2020 05:44:14 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729152AbgAOSn0 (ORCPT ); Wed, 15 Jan 2020 13:43:26 -0500 Received: from mail-qk1-f202.google.com ([209.85.222.202]:43707 "EHLO mail-qk1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729133AbgAOSnZ (ORCPT ); Wed, 15 Jan 2020 13:43:25 -0500 Received: by mail-qk1-f202.google.com with SMTP id f22so11268356qka.10 for ; Wed, 15 Jan 2020 10:43:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=Jc5+7d7NtiEYIwbVpgtYmSFqbGpKI69cS8An9zRo6DQ=; b=ebYO85PYKYyM1gQvISh1WFoTt54iljODvfFADMN5Lujvd3yN3fEeYdR01uCnKmJgtf q2q3vl1yi6XChSfva3Yq12nVfCCjuyHR9MbZtsysW6ZR5Vh4etOZBjtxL4n6cdHn/E7L ckeRzx0x04nPbys+opaGXqqWNuZPTDJghC9MTYqxIfyZGvvvKFr8PWGirvZCFxRcqOmm AM8urMOnV6Tjm6xOV9xT5AYk3PV28cPJCjp+Uy+6zqkHF7BwhEM1Cqxjjf6E+5ROTP/p Li9YUTNgD5xLY47+QZtLBwDbzSvfKxp23J2u6D5f+ALFXgyFRw4H3DXTF8YBgZQ5CYgx KnYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=Jc5+7d7NtiEYIwbVpgtYmSFqbGpKI69cS8An9zRo6DQ=; b=Y0Pu8lSE91fQCRsQKsAwaIFhP6Dm60pe7/Fbk9cws8b6lND5lJryYFXJ5WVJUdRbaE THdNM8bioICihnSsmthlHFqgQkvuoJpp8kOerHuYkCxKI1sQnicpC/+lfZw9qfsZdKG3 h1Ng5Vhlj/C76YeBzSz2oTugc1Z/IrWNznJHBeYJNMpmwrv93QEXY8dFIQwcQs56cIYE 321HsuaKLql08SNRv2JijW6BSAwKLmF7mhootf9pkdzstMZgKD/UR5dphilygmxAQm61 2IIjCAt1Em9KWVzC3K/8Y8mmp4hb8t7l7VHBe212ClP/sJX9YEHESkLbglonUeSFigTm diTA== X-Gm-Message-State: APjAAAUG1Kd0C+r7pKnsWb06/idTpW3gHaSLOHnxCb06Zs3bHxG6RZ1S yjzJ9gZY9O3AMvghP+Y/fr/CVzhUV4vx X-Google-Smtp-Source: APXvYqxnQKeSsJ9tODh/EvmIWwDOBtYI5Kkst4SxmLTSzNn9pq8QcNJNXaITN/H0/GuDPScqENreNlxD5kfI X-Received: by 2002:a05:6214:1103:: with SMTP id e3mr23612085qvs.159.1579113804047; Wed, 15 Jan 2020 10:43:24 -0800 (PST) Date: Wed, 15 Jan 2020 10:43:00 -0800 In-Reply-To: <20200115184308.162644-1-brianvv@google.com> Message-Id: <20200115184308.162644-2-brianvv@google.com> Mime-Version: 1.0 References: <20200115184308.162644-1-brianvv@google.com> X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog Subject: [PATCH v5 bpf-next 1/9] bpf: add bpf_map_{value_size, update_value, map_copy_value} functions From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, John Fastabend Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This commit moves reusable code from map_lookup_elem and map_update_elem to avoid code duplication in kernel/bpf/syscall.c. Signed-off-by: Brian Vazquez Acked-by: John Fastabend Acked-by: Yonghong Song --- kernel/bpf/syscall.c | 280 +++++++++++++++++++++++-------------------- 1 file changed, 152 insertions(+), 128 deletions(-) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index f9db72a96ec04..08b0b6e40454b 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -129,6 +129,154 @@ static struct bpf_map *find_and_alloc_map(union bpf_attr *attr) return map; } +static u32 bpf_map_value_size(struct bpf_map *map) +{ + if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY || + map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) + return round_up(map->value_size, 8) * num_possible_cpus(); + else if (IS_FD_MAP(map)) + return sizeof(u32); + else + return map->value_size; +} + +static void maybe_wait_bpf_programs(struct bpf_map *map) +{ + /* Wait for any running BPF programs to complete so that + * userspace, when we return to it, knows that all programs + * that could be running use the new map value. + */ + if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS || + map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS) + synchronize_rcu(); +} + +static int bpf_map_update_value(struct bpf_map *map, struct fd f, void *key, + void *value, __u64 flags) +{ + int err; + + /* Need to create a kthread, thus must support schedule */ + if (bpf_map_is_dev_bound(map)) { + return bpf_map_offload_update_elem(map, key, value, flags); + } else if (map->map_type == BPF_MAP_TYPE_CPUMAP || + map->map_type == BPF_MAP_TYPE_SOCKHASH || + map->map_type == BPF_MAP_TYPE_SOCKMAP || + map->map_type == BPF_MAP_TYPE_STRUCT_OPS) { + return map->ops->map_update_elem(map, key, value, flags); + } else if (IS_FD_PROG_ARRAY(map)) { + return bpf_fd_array_map_update_elem(map, f.file, key, value, + flags); + } + + /* must increment bpf_prog_active to avoid kprobe+bpf triggering from + * inside bpf map update or delete otherwise deadlocks are possible + */ + preempt_disable(); + __this_cpu_inc(bpf_prog_active); + if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { + err = bpf_percpu_hash_update(map, key, value, flags); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { + err = bpf_percpu_array_update(map, key, value, flags); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) { + err = bpf_percpu_cgroup_storage_update(map, key, value, + flags); + } else if (IS_FD_ARRAY(map)) { + rcu_read_lock(); + err = bpf_fd_array_map_update_elem(map, f.file, key, value, + flags); + rcu_read_unlock(); + } else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) { + rcu_read_lock(); + err = bpf_fd_htab_map_update_elem(map, f.file, key, value, + flags); + rcu_read_unlock(); + } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { + /* rcu_read_lock() is not needed */ + err = bpf_fd_reuseport_array_update_elem(map, key, value, + flags); + } else if (map->map_type == BPF_MAP_TYPE_QUEUE || + map->map_type == BPF_MAP_TYPE_STACK) { + err = map->ops->map_push_elem(map, value, flags); + } else { + rcu_read_lock(); + err = map->ops->map_update_elem(map, key, value, flags); + rcu_read_unlock(); + } + __this_cpu_dec(bpf_prog_active); + preempt_enable(); + maybe_wait_bpf_programs(map); + + return err; +} + +static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value, + __u64 flags) +{ + void *ptr; + int err; + + if (bpf_map_is_dev_bound(map)) { + err = bpf_map_offload_lookup_elem(map, key, value); + return err; + } + + preempt_disable(); + this_cpu_inc(bpf_prog_active); + if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { + err = bpf_percpu_hash_copy(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { + err = bpf_percpu_array_copy(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) { + err = bpf_percpu_cgroup_storage_copy(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) { + err = bpf_stackmap_copy(map, key, value); + } else if (IS_FD_ARRAY(map) || IS_FD_PROG_ARRAY(map)) { + err = bpf_fd_array_map_lookup_elem(map, key, value); + } else if (IS_FD_HASH(map)) { + err = bpf_fd_htab_map_lookup_elem(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { + err = bpf_fd_reuseport_array_lookup_elem(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_QUEUE || + map->map_type == BPF_MAP_TYPE_STACK) { + err = map->ops->map_peek_elem(map, value); + } else if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS) { + /* struct_ops map requires directly updating "value" */ + err = bpf_struct_ops_map_sys_lookup_elem(map, key, value); + } else { + rcu_read_lock(); + if (map->ops->map_lookup_elem_sys_only) + ptr = map->ops->map_lookup_elem_sys_only(map, key); + else + ptr = map->ops->map_lookup_elem(map, key); + if (IS_ERR(ptr)) { + err = PTR_ERR(ptr); + } else if (!ptr) { + err = -ENOENT; + } else { + err = 0; + if (flags & BPF_F_LOCK) + /* lock 'ptr' and copy everything but lock */ + copy_map_value_locked(map, value, ptr, true); + else + copy_map_value(map, value, ptr); + /* mask lock, since value wasn't zero inited */ + check_and_init_map_lock(map, value); + } + rcu_read_unlock(); + } + + this_cpu_dec(bpf_prog_active); + preempt_enable(); + maybe_wait_bpf_programs(map); + + return err; +} + static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable) { /* We really just want to fail instead of triggering OOM killer @@ -827,7 +975,7 @@ static int map_lookup_elem(union bpf_attr *attr) void __user *uvalue = u64_to_user_ptr(attr->value); int ufd = attr->map_fd; struct bpf_map *map; - void *key, *value, *ptr; + void *key, *value; u32 value_size; struct fd f; int err; @@ -859,75 +1007,14 @@ static int map_lookup_elem(union bpf_attr *attr) goto err_put; } - if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || - map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH || - map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY || - map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) - value_size = round_up(map->value_size, 8) * num_possible_cpus(); - else if (IS_FD_MAP(map)) - value_size = sizeof(u32); - else - value_size = map->value_size; + value_size = bpf_map_value_size(map); err = -ENOMEM; value = kmalloc(value_size, GFP_USER | __GFP_NOWARN); if (!value) goto free_key; - if (bpf_map_is_dev_bound(map)) { - err = bpf_map_offload_lookup_elem(map, key, value); - goto done; - } - - preempt_disable(); - this_cpu_inc(bpf_prog_active); - if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || - map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { - err = bpf_percpu_hash_copy(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { - err = bpf_percpu_array_copy(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) { - err = bpf_percpu_cgroup_storage_copy(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) { - err = bpf_stackmap_copy(map, key, value); - } else if (IS_FD_ARRAY(map) || IS_FD_PROG_ARRAY(map)) { - err = bpf_fd_array_map_lookup_elem(map, key, value); - } else if (IS_FD_HASH(map)) { - err = bpf_fd_htab_map_lookup_elem(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { - err = bpf_fd_reuseport_array_lookup_elem(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_QUEUE || - map->map_type == BPF_MAP_TYPE_STACK) { - err = map->ops->map_peek_elem(map, value); - } else if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS) { - /* struct_ops map requires directly updating "value" */ - err = bpf_struct_ops_map_sys_lookup_elem(map, key, value); - } else { - rcu_read_lock(); - if (map->ops->map_lookup_elem_sys_only) - ptr = map->ops->map_lookup_elem_sys_only(map, key); - else - ptr = map->ops->map_lookup_elem(map, key); - if (IS_ERR(ptr)) { - err = PTR_ERR(ptr); - } else if (!ptr) { - err = -ENOENT; - } else { - err = 0; - if (attr->flags & BPF_F_LOCK) - /* lock 'ptr' and copy everything but lock */ - copy_map_value_locked(map, value, ptr, true); - else - copy_map_value(map, value, ptr); - /* mask lock, since value wasn't zero inited */ - check_and_init_map_lock(map, value); - } - rcu_read_unlock(); - } - this_cpu_dec(bpf_prog_active); - preempt_enable(); - -done: + err = bpf_map_copy_value(map, key, value, attr->flags); if (err) goto free_value; @@ -946,16 +1033,6 @@ static int map_lookup_elem(union bpf_attr *attr) return err; } -static void maybe_wait_bpf_programs(struct bpf_map *map) -{ - /* Wait for any running BPF programs to complete so that - * userspace, when we return to it, knows that all programs - * that could be running use the new map value. - */ - if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS || - map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS) - synchronize_rcu(); -} #define BPF_MAP_UPDATE_ELEM_LAST_FIELD flags @@ -1011,61 +1088,8 @@ static int map_update_elem(union bpf_attr *attr) if (copy_from_user(value, uvalue, value_size) != 0) goto free_value; - /* Need to create a kthread, thus must support schedule */ - if (bpf_map_is_dev_bound(map)) { - err = bpf_map_offload_update_elem(map, key, value, attr->flags); - goto out; - } else if (map->map_type == BPF_MAP_TYPE_CPUMAP || - map->map_type == BPF_MAP_TYPE_SOCKHASH || - map->map_type == BPF_MAP_TYPE_SOCKMAP || - map->map_type == BPF_MAP_TYPE_STRUCT_OPS) { - err = map->ops->map_update_elem(map, key, value, attr->flags); - goto out; - } else if (IS_FD_PROG_ARRAY(map)) { - err = bpf_fd_array_map_update_elem(map, f.file, key, value, - attr->flags); - goto out; - } + err = bpf_map_update_value(map, f, key, value, attr->flags); - /* must increment bpf_prog_active to avoid kprobe+bpf triggering from - * inside bpf map update or delete otherwise deadlocks are possible - */ - preempt_disable(); - __this_cpu_inc(bpf_prog_active); - if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || - map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { - err = bpf_percpu_hash_update(map, key, value, attr->flags); - } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { - err = bpf_percpu_array_update(map, key, value, attr->flags); - } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) { - err = bpf_percpu_cgroup_storage_update(map, key, value, - attr->flags); - } else if (IS_FD_ARRAY(map)) { - rcu_read_lock(); - err = bpf_fd_array_map_update_elem(map, f.file, key, value, - attr->flags); - rcu_read_unlock(); - } else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) { - rcu_read_lock(); - err = bpf_fd_htab_map_update_elem(map, f.file, key, value, - attr->flags); - rcu_read_unlock(); - } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { - /* rcu_read_lock() is not needed */ - err = bpf_fd_reuseport_array_update_elem(map, key, value, - attr->flags); - } else if (map->map_type == BPF_MAP_TYPE_QUEUE || - map->map_type == BPF_MAP_TYPE_STACK) { - err = map->ops->map_push_elem(map, value, attr->flags); - } else { - rcu_read_lock(); - err = map->ops->map_update_elem(map, key, value, attr->flags); - rcu_read_unlock(); - } - __this_cpu_dec(bpf_prog_active); - preempt_enable(); - maybe_wait_bpf_programs(map); -out: free_value: kfree(value); free_key: From patchwork Wed Jan 15 18:43:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1223783 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=kmyws+ot; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47ybnJ1JzQz9sR1 for ; Thu, 16 Jan 2020 05:43:32 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729240AbgAOSna (ORCPT ); Wed, 15 Jan 2020 13:43:30 -0500 Received: from mail-pf1-f202.google.com ([209.85.210.202]:57211 "EHLO mail-pf1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729164AbgAOSn1 (ORCPT ); Wed, 15 Jan 2020 13:43:27 -0500 Received: by mail-pf1-f202.google.com with SMTP id h16so11439221pfn.23 for ; Wed, 15 Jan 2020 10:43:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=tz5xgOut9x3UI8UwL3xgmMSxaEDJNtFpPyGv6YNxIbw=; b=kmyws+otBIE9Ecpup9S5LsbuxRSQr1xp6+I1JzBwg+ja7qTXasp43mzqMiDWgRz3+x I+0uIQ6D9L6D2L1vwKKkQOmtWAsQCs2hblHhXCDcPZ7D4supA/jOE/s74vFuhLt5RU5S Q0yJ36NRWumr2ZdIKb/pKgQZJ+FltOatZ+CjvZSW/KUxS6V5deD0VwwXADueJXT6s7bo eiCyu+ygq0VnlNL/f8LJ9KY9bg42GlCN9ZqUFlExp0QkZO7MQbBu5G2w7M8OaOPqpv58 8Th1juLdZIPYEe6vIMroHh930VRav6nW7texgAeLu3pzOrzUsTV1Zf818DyY/KFhrTfn 6DQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=tz5xgOut9x3UI8UwL3xgmMSxaEDJNtFpPyGv6YNxIbw=; b=l1zsFF96PtlRnBmXWnY0JqSZ1wTGINi3iUtxmpolhn9Sn2POrbIXiGCZBcOL/eO4fE e1BXXOCCXviCKFX9AFnSHp+/Hg+KS12ctUHPO1iE0ffP5hoLREr8NlXOl9hLG0P9K+yU 7vpv9f/JYyZjylHZvNpu2bTJ+69G1iMYCF8F8NOZK5qE5Bp7hbQkaF4vPD5qEXxIwvyQ VKIpQ73Xxrehs1lWSWNvuJBDSPAOkXm/IZBw7dV99nsA4NsfCa3zzLlovZME9WMgz7nT hRVpRkkfu7n8TyJ1SmKqv0eBZ68nI78E/p7Pul90L2NH2ZdYIMyBIVfjg6/a3w7Yl/ml otsw== X-Gm-Message-State: APjAAAU7Ok8JM+xykOToQertZzstyArmwPJlm7AjpLc0cEClbOzfdBEk pgYAy3jFJTvS6GxycU3KJwMwa0mod3Bc X-Google-Smtp-Source: APXvYqxDDvVwRB34BB0+uLfkMzU6LBD3k4eXsiaEETIErWB8rUFiaIvQT/cUEKHnWq3Eg0rFghVlvCk090Ii X-Received: by 2002:a65:68d4:: with SMTP id k20mr35174972pgt.142.1579113806532; Wed, 15 Jan 2020 10:43:26 -0800 (PST) Date: Wed, 15 Jan 2020 10:43:01 -0800 In-Reply-To: <20200115184308.162644-1-brianvv@google.com> Message-Id: <20200115184308.162644-3-brianvv@google.com> Mime-Version: 1.0 References: <20200115184308.162644-1-brianvv@google.com> X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog Subject: [PATCH v5 bpf-next 2/9] bpf: add generic support for lookup batch op From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org This commit introduces generic support for the bpf_map_lookup_batch. This implementation can be used by almost all the bpf maps since its core implementation is relying on the existing map_get_next_key and map_lookup_elem. The bpf syscall subcommand introduced is: BPF_MAP_LOOKUP_BATCH The UAPI attribute is: struct { /* struct used by BPF_MAP_*_BATCH commands */ __aligned_u64 in_batch; /* start batch, * NULL to start from beginning */ __aligned_u64 out_batch; /* output: next start batch */ __aligned_u64 keys; __aligned_u64 values; __u32 count; /* input/output: * input: # of key/value * elements * output: # of filled elements */ __u32 map_fd; __u64 elem_flags; __u64 flags; } batch; in_batch/out_batch are opaque values use to communicate between user/kernel space, in_batch/out_batch must be of key_size length. To start iterating from the beginning in_batch must be null, count is the # of key/value elements to retrieve. Note that the 'keys' buffer must be a buffer of key_size * count size and the 'values' buffer must be value_size * count, where value_size must be aligned to 8 bytes by userspace if it's dealing with percpu maps. 'count' will contain the number of keys/values successfully retrieved. Note that 'count' is an input/output variable and it can contain a lower value after a call. If there's no more entries to retrieve, ENOENT will be returned. If error is ENOENT, count might be > 0 in case it copied some values but there were no more entries to retrieve. Note that if the return code is an error and not -EFAULT, count indicates the number of elements successfully processed. Suggested-by: Stanislav Fomichev Signed-off-by: Brian Vazquez Signed-off-by: Yonghong Song --- include/linux/bpf.h | 5 ++ include/uapi/linux/bpf.h | 18 +++++ kernel/bpf/syscall.c | 160 ++++++++++++++++++++++++++++++++++++++- 3 files changed, 179 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index aed2bc39d72b6..807744ecaa5a1 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -44,6 +44,8 @@ struct bpf_map_ops { int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key); void (*map_release_uref)(struct bpf_map *map); void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key); + int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr, + union bpf_attr __user *uattr); /* funcs callable from userspace and from eBPF programs */ void *(*map_lookup_elem)(struct bpf_map *map, void *key); @@ -982,6 +984,9 @@ void *bpf_map_area_alloc(u64 size, int numa_node); void *bpf_map_area_mmapable_alloc(u64 size, int numa_node); void bpf_map_area_free(void *base); void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr); +int generic_map_lookup_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr); extern int sysctl_unprivileged_bpf_disabled; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 52966e758fe59..8185f1542daa1 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -107,6 +107,7 @@ enum bpf_cmd { BPF_MAP_LOOKUP_AND_DELETE_ELEM, BPF_MAP_FREEZE, BPF_BTF_GET_NEXT_ID, + BPF_MAP_LOOKUP_BATCH, }; enum bpf_map_type { @@ -420,6 +421,23 @@ union bpf_attr { __u64 flags; }; + struct { /* struct used by BPF_MAP_*_BATCH commands */ + __aligned_u64 in_batch; /* start batch, + * NULL to start from beginning + */ + __aligned_u64 out_batch; /* output: next start batch */ + __aligned_u64 keys; + __aligned_u64 values; + __u32 count; /* input/output: + * input: # of key/value + * elements + * output: # of filled elements + */ + __u32 map_fd; + __u64 elem_flags; + __u64 flags; + } batch; + struct { /* anonymous struct used by BPF_PROG_LOAD command */ __u32 prog_type; /* one of enum bpf_prog_type */ __u32 insn_cnt; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 08b0b6e40454b..d604ddbb1afbe 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -219,10 +219,8 @@ static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value, void *ptr; int err; - if (bpf_map_is_dev_bound(map)) { - err = bpf_map_offload_lookup_elem(map, key, value); - return err; - } + if (bpf_map_is_dev_bound(map)) + return bpf_map_offload_lookup_elem(map, key, value); preempt_disable(); this_cpu_inc(bpf_prog_active); @@ -1220,6 +1218,109 @@ static int map_get_next_key(union bpf_attr *attr) return err; } +#define MAP_LOOKUP_RETRIES 3 + +int generic_map_lookup_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + void __user *uobatch = u64_to_user_ptr(attr->batch.out_batch); + void __user *ubatch = u64_to_user_ptr(attr->batch.in_batch); + void __user *values = u64_to_user_ptr(attr->batch.values); + void __user *keys = u64_to_user_ptr(attr->batch.keys); + void *buf, *buf_prevkey, *prev_key, *key, *value; + int err, retry = MAP_LOOKUP_RETRIES; + u32 value_size, cp, max_count; + bool first_key = false; + + if (attr->batch.elem_flags & ~BPF_F_LOCK) + return -EINVAL; + + if ((attr->batch.elem_flags & BPF_F_LOCK) && + !map_value_has_spin_lock(map)) + return -EINVAL; + + value_size = bpf_map_value_size(map); + + max_count = attr->batch.count; + if (!max_count) + return 0; + + if (put_user(0, &uattr->batch.count)) + return -EFAULT; + + buf_prevkey = kmalloc(map->key_size, GFP_USER | __GFP_NOWARN); + if (!buf_prevkey) + return -ENOMEM; + + buf = kmalloc(map->key_size + value_size, GFP_USER | __GFP_NOWARN); + if (!buf) { + kvfree(buf_prevkey); + return -ENOMEM; + } + + err = -EFAULT; + first_key = false; + prev_key = NULL; + if (ubatch && copy_from_user(buf_prevkey, ubatch, map->key_size)) + goto free_buf; + key = buf; + value = key + map->key_size; + if (ubatch) + prev_key = buf_prevkey; + + for (cp = 0; cp < max_count;) { + rcu_read_lock(); + err = map->ops->map_get_next_key(map, prev_key, key); + rcu_read_unlock(); + if (err) + break; + err = bpf_map_copy_value(map, key, value, + attr->batch.elem_flags); + + if (err == -ENOENT) { + if (retry) { + retry--; + continue; + } + err = -EINTR; + break; + } + + if (err) + goto free_buf; + + if (copy_to_user(keys + cp * map->key_size, key, + map->key_size)) { + err = -EFAULT; + goto free_buf; + } + if (copy_to_user(values + cp * value_size, value, value_size)) { + err = -EFAULT; + goto free_buf; + } + + if (!prev_key) + prev_key = buf_prevkey; + + swap(prev_key, key); + retry = MAP_LOOKUP_RETRIES; + cp++; + } + + if (err == -EFAULT) + goto free_buf; + + if ((copy_to_user(&uattr->batch.count, &cp, sizeof(cp)) || + (cp && copy_to_user(uobatch, prev_key, map->key_size)))) + err = -EFAULT; + +free_buf: + kfree(buf_prevkey); + kfree(buf); + return err; +} + #define BPF_MAP_LOOKUP_AND_DELETE_ELEM_LAST_FIELD value static int map_lookup_and_delete_elem(union bpf_attr *attr) @@ -3076,6 +3177,54 @@ static int bpf_task_fd_query(const union bpf_attr *attr, return err; } +#define BPF_MAP_BATCH_LAST_FIELD batch.flags + +#define BPF_DO_BATCH(fn) \ + do { \ + if (!fn) { \ + err = -ENOTSUPP; \ + goto err_put; \ + } \ + err = fn(map, attr, uattr); \ + } while (0) + +static int bpf_map_do_batch(const union bpf_attr *attr, + union bpf_attr __user *uattr, + int cmd) +{ + struct bpf_map *map; + int err, ufd; + struct fd f; + + if (CHECK_ATTR(BPF_MAP_BATCH)) + return -EINVAL; + + ufd = attr->batch.map_fd; + f = fdget(ufd); + map = __bpf_map_get(f); + if (IS_ERR(map)) + return PTR_ERR(map); + + if (cmd == BPF_MAP_LOOKUP_BATCH && + !(map_get_sys_perms(map, f) & FMODE_CAN_READ)) { + err = -EPERM; + goto err_put; + } + + if (cmd != BPF_MAP_LOOKUP_BATCH && + !(map_get_sys_perms(map, f) & FMODE_CAN_WRITE)) { + err = -EPERM; + goto err_put; + } + + if (cmd == BPF_MAP_LOOKUP_BATCH) + BPF_DO_BATCH(map->ops->map_lookup_batch); + +err_put: + fdput(f); + return err; +} + SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size) { union bpf_attr attr = {}; @@ -3173,6 +3322,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz case BPF_MAP_LOOKUP_AND_DELETE_ELEM: err = map_lookup_and_delete_elem(&attr); break; + case BPF_MAP_LOOKUP_BATCH: + err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH); + break; default: err = -EINVAL; break; From patchwork Wed Jan 15 18:43:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1223785 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=akYxcL9+; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47ybnM5Nqkz9sR8 for ; Thu, 16 Jan 2020 05:43:35 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729291AbgAOSne (ORCPT ); Wed, 15 Jan 2020 13:43:34 -0500 Received: from mail-pf1-f202.google.com ([209.85.210.202]:54132 "EHLO mail-pf1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729213AbgAOSna (ORCPT ); Wed, 15 Jan 2020 13:43:30 -0500 Received: by mail-pf1-f202.google.com with SMTP id k26so11437743pfp.20 for ; Wed, 15 Jan 2020 10:43:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=+HqP+4pv65tNa2gKmR3/Ufe3GB9e142iVvYjLgWUjzs=; b=akYxcL9+ITuQ5MSssRxGCnVoBxHKT823lD+O+bV4ruD6VuFQCLd8Mvz3zTqrZJLeuJ 3ELpEryZsOSjEgYTK605geIdA62KHipRYON1jN/zkiqQZGBm0KxZhLk9L4ElMyasjQW0 66VxWAy/NtKXw7mwYiStg3XZSezuaYKMZWPAt3cDbNAOu1Y98syq46Y4o/54ZcPBwe2V 19Y/uOaEEbfzarONYn+1jRMFTdUFGGjMgZizQpk00bZahqrbEGSCDdrxYMZtNHjgNNPz cSz04MN5afU7riHlu1k19g5naPrPM591MhVPFhhtUNmBRppY/RsitCIfKZDWMWiUtpph dlzw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=+HqP+4pv65tNa2gKmR3/Ufe3GB9e142iVvYjLgWUjzs=; b=ARHLFAbw0PhyH3MANuUtjuGWdtaN187KMcPnhSxwuft4J7bLLqbYcUAsaz8Vs/o13D XAd8g6pcqXhswzqglyscjlbQp9/+tycqzxgut56/VcY0XE0OlJfxRiCQFIWJY9tL69ko 1YBYD8MkFuFEqLjt9ZZ4iMukFJJpIEafstO65rl/mummlxgVlNwkK1brT9g29hDDQO8J N1NhShkLYgiedySILzdMCp5hkkgwqLCoCBTwqewpUEK2+JMswhmBfd/k2PvdhoCYJYPF DfjkwrCDcw6/t4wSXrBmUnAby6lhQpSwRFevsZbzjHxNawtiYTeh8kLbqfc6b+XhezJS J6eQ== X-Gm-Message-State: APjAAAXJr3uvPrLJVqsiL4DcRqS2ULcrmE+h7hSo3rLcA7iMRUDy1wWK zNbSwB/WvabRLoijyOi2HOyf1DTxLBzF X-Google-Smtp-Source: APXvYqzKo2keGutY0WAmSicGGy79WlONhp388LEj+kDtrBSYPCwdd+l0xdDVmntBheaVzDOJEev9szmYkbp9 X-Received: by 2002:a63:a511:: with SMTP id n17mr33958608pgf.338.1579113809055; Wed, 15 Jan 2020 10:43:29 -0800 (PST) Date: Wed, 15 Jan 2020 10:43:02 -0800 In-Reply-To: <20200115184308.162644-1-brianvv@google.com> Message-Id: <20200115184308.162644-4-brianvv@google.com> Mime-Version: 1.0 References: <20200115184308.162644-1-brianvv@google.com> X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog Subject: [PATCH v5 bpf-next 3/9] bpf: add generic support for update and delete batch ops From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org This commit adds generic support for update and delete batch ops that can be used for almost all the bpf maps. These commands share the same UAPI attr that lookup and lookup_and_delete batch ops use and the syscall commands are: BPF_MAP_UPDATE_BATCH BPF_MAP_DELETE_BATCH The main difference between update/delete and lookup batch ops is that for update/delete keys/values must be specified for userspace and because of that, neither in_batch nor out_batch are used. Suggested-by: Stanislav Fomichev Signed-off-by: Brian Vazquez Signed-off-by: Yonghong Song --- include/linux/bpf.h | 10 ++++ include/uapi/linux/bpf.h | 2 + kernel/bpf/syscall.c | 115 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 127 insertions(+) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 807744ecaa5a1..05466ad6cf1c5 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -46,6 +46,10 @@ struct bpf_map_ops { void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key); int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr, union bpf_attr __user *uattr); + int (*map_update_batch)(struct bpf_map *map, const union bpf_attr *attr, + union bpf_attr __user *uattr); + int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr, + union bpf_attr __user *uattr); /* funcs callable from userspace and from eBPF programs */ void *(*map_lookup_elem)(struct bpf_map *map, void *key); @@ -987,6 +991,12 @@ void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr); int generic_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr, union bpf_attr __user *uattr); +int generic_map_update_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr); +int generic_map_delete_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr); extern int sysctl_unprivileged_bpf_disabled; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 8185f1542daa1..e8df9ca680e0c 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -108,6 +108,8 @@ enum bpf_cmd { BPF_MAP_FREEZE, BPF_BTF_GET_NEXT_ID, BPF_MAP_LOOKUP_BATCH, + BPF_MAP_UPDATE_BATCH, + BPF_MAP_DELETE_BATCH, }; enum bpf_map_type { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index d604ddbb1afbe..ce8244d1ba99c 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1218,6 +1218,111 @@ static int map_get_next_key(union bpf_attr *attr) return err; } +int generic_map_delete_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + void __user *keys = u64_to_user_ptr(attr->batch.keys); + u32 cp, max_count; + int err = 0; + void *key; + + if (attr->batch.elem_flags & ~BPF_F_LOCK) + return -EINVAL; + + if ((attr->batch.elem_flags & BPF_F_LOCK) && + !map_value_has_spin_lock(map)) { + return -EINVAL; + } + + max_count = attr->batch.count; + if (!max_count) + return 0; + + for (cp = 0; cp < max_count; cp++) { + key = __bpf_copy_key(keys + cp * map->key_size, map->key_size); + if (IS_ERR(key)) { + err = PTR_ERR(key); + break; + } + + if (bpf_map_is_dev_bound(map)) { + err = bpf_map_offload_delete_elem(map, key); + break; + } + + preempt_disable(); + __this_cpu_inc(bpf_prog_active); + rcu_read_lock(); + err = map->ops->map_delete_elem(map, key); + rcu_read_unlock(); + __this_cpu_dec(bpf_prog_active); + preempt_enable(); + maybe_wait_bpf_programs(map); + if (err) + break; + } + if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp))) + err = -EFAULT; + return err; +} + +int generic_map_update_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + void __user *values = u64_to_user_ptr(attr->batch.values); + void __user *keys = u64_to_user_ptr(attr->batch.keys); + u32 value_size, cp, max_count; + int ufd = attr->map_fd; + void *key, *value; + struct fd f; + int err = 0; + + f = fdget(ufd); + if (attr->batch.elem_flags & ~BPF_F_LOCK) + return -EINVAL; + + if ((attr->batch.elem_flags & BPF_F_LOCK) && + !map_value_has_spin_lock(map)) { + return -EINVAL; + } + + value_size = bpf_map_value_size(map); + + max_count = attr->batch.count; + if (!max_count) + return 0; + + value = kmalloc(value_size, GFP_USER | __GFP_NOWARN); + if (!value) + return -ENOMEM; + + for (cp = 0; cp < max_count; cp++) { + key = __bpf_copy_key(keys + cp * map->key_size, map->key_size); + if (IS_ERR(key)) { + err = PTR_ERR(key); + break; + } + err = -EFAULT; + if (copy_from_user(value, values + cp * value_size, value_size)) + break; + + err = bpf_map_update_value(map, f, key, value, + attr->batch.elem_flags); + + if (err) + break; + } + + if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp))) + err = -EFAULT; + + kfree(value); + kfree(key); + return err; +} + #define MAP_LOOKUP_RETRIES 3 int generic_map_lookup_batch(struct bpf_map *map, @@ -3219,6 +3324,10 @@ static int bpf_map_do_batch(const union bpf_attr *attr, if (cmd == BPF_MAP_LOOKUP_BATCH) BPF_DO_BATCH(map->ops->map_lookup_batch); + else if (cmd == BPF_MAP_UPDATE_BATCH) + BPF_DO_BATCH(map->ops->map_update_batch); + else + BPF_DO_BATCH(map->ops->map_delete_batch); err_put: fdput(f); @@ -3325,6 +3434,12 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz case BPF_MAP_LOOKUP_BATCH: err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH); break; + case BPF_MAP_UPDATE_BATCH: + err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH); + break; + case BPF_MAP_DELETE_BATCH: + err = bpf_map_do_batch(&attr, uattr, BPF_MAP_DELETE_BATCH); + break; default: err = -EINVAL; break; From patchwork Wed Jan 15 18:43:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1223795 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=u6yLAPqV; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47ybp022yDz9sR1 for ; Thu, 16 Jan 2020 05:44:08 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729354AbgAOSng (ORCPT ); Wed, 15 Jan 2020 13:43:36 -0500 Received: from mail-pf1-f202.google.com ([209.85.210.202]:55708 "EHLO mail-pf1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729268AbgAOSnc (ORCPT ); Wed, 15 Jan 2020 13:43:32 -0500 Received: by mail-pf1-f202.google.com with SMTP id 8so11442012pfb.22 for ; Wed, 15 Jan 2020 10:43:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=bTzv6GNpgfrRZmeHOV0jVc+oFdZ9HuTWtPYHIUcg/W4=; b=u6yLAPqVa3ro09ZQhcPDbZBH6GA5ByMOsoiAYioTquy1hYzT8AeaEuB4kL/6IqUBPx 0AqW1RIVD5mdOKoW9TemSg0AcKRhjLT3fUbCuIyhuy1yWsB7VlhrjwJkPRNshtCrszdb F/1Pk3RRjKtgGFtVaKh/hwIV28EmWUB8esDtd+10tZGMRs2tIXL1Eh8Dl72vyVMDfkgn Vl1LasdRigJ4pIrOOmCVj6GfgOZDFO2RQX1Gqxv2zrTraGVkFdNO+Iv+F+YDswQTy/dz MtqEXUYbt2z18+QEjMLnKygiEOVfMNunMnm29fORAte0NRZSBRmHgylNnqlh76NQSB7G GH9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=bTzv6GNpgfrRZmeHOV0jVc+oFdZ9HuTWtPYHIUcg/W4=; b=jHIf4H2jOIv3nXVktAiZyONQs7h3qngyNq0LBPs3YXFjDS1DF89qhBnkKbz6GrbVVh ZA7GfcPnpddDzyn+bU35DzOy+1OAiN+ndR5L2Skj3kamqK8Gn1NM0Cvh19jTiYSOKug0 QALllLtFLEqI7f0K4GJ/gH3aHDtl8aJ1AlJ5AcojsUGijx6LQhmaeOHXUuN02zT9nB5N em2+Qcec7MFFAJhYhsXvGSVgJxgxcrrhglLanS3ERWdaZ13bwUopZ039ek61xdU1C6xw YQoA0ChUSkaDJQp7OAQxOLqaHVci2AYRoJjPYdimX7g1ZKfgpWderzElyUpitkMb3E7v cmnw== X-Gm-Message-State: APjAAAWM1xBw7fPQi2FIErH46R7dbIfXZtefnZNTRF7AHu//kERDWOjG H4uaayxcqoG7jMYBcv507jhELRvU9rkb X-Google-Smtp-Source: APXvYqxTfe7jLbFuDD+gZt1hs9KW/RLDoX7XAgEuKsCqDnYZQRqqvUy6B5pjHryPABA00K3D9D/UlRd4jAFC X-Received: by 2002:a65:578e:: with SMTP id b14mr34722927pgr.444.1579113811637; Wed, 15 Jan 2020 10:43:31 -0800 (PST) Date: Wed, 15 Jan 2020 10:43:03 -0800 In-Reply-To: <20200115184308.162644-1-brianvv@google.com> Message-Id: <20200115184308.162644-5-brianvv@google.com> Mime-Version: 1.0 References: <20200115184308.162644-1-brianvv@google.com> X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog Subject: [PATCH v5 bpf-next 4/9] bpf: add lookup and update batch ops to arraymap From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This adds the generic batch ops functionality to bpf arraymap, note that since deletion is not a valid operation for arraymap, only batch and lookup are added. Signed-off-by: Brian Vazquez Acked-by: Yonghong Song --- kernel/bpf/arraymap.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c index f0d19bbb9211e..95d77770353c9 100644 --- a/kernel/bpf/arraymap.c +++ b/kernel/bpf/arraymap.c @@ -503,6 +503,8 @@ const struct bpf_map_ops array_map_ops = { .map_mmap = array_map_mmap, .map_seq_show_elem = array_map_seq_show_elem, .map_check_btf = array_map_check_btf, + .map_lookup_batch = generic_map_lookup_batch, + .map_update_batch = generic_map_update_batch, }; const struct bpf_map_ops percpu_array_map_ops = { From patchwork Wed Jan 15 18:43:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1223793 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=gSilzt0P; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47ybnw1BN3z9sRW for ; Thu, 16 Jan 2020 05:44:04 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729377AbgAOSnh (ORCPT ); Wed, 15 Jan 2020 13:43:37 -0500 Received: from mail-pl1-f201.google.com ([209.85.214.201]:36726 "EHLO mail-pl1-f201.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729287AbgAOSnf (ORCPT ); Wed, 15 Jan 2020 13:43:35 -0500 Received: by mail-pl1-f201.google.com with SMTP id t3so7358817plz.3 for ; Wed, 15 Jan 2020 10:43:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=BUa2sGYQ5pE+WXHCee7JUKHRZrR8FCZKjHLXT2C4YTU=; b=gSilzt0Py5C7B8PeUlFXzIaWrEfWpYQRf7qJRuSoPmzaRdLtoefLCqKt4Bd8BqrOxb FR4EZ9Dfi0pAql/EDoC+U4z7Cpv32ErvwhPi3s6L92XAiOSWColy/iTzK2dMdglRJft0 rl53cee7V16Sbth0ryq16nBR36tlQ55YpoHyiKuNGS+W0eN9d94ZWB4tvPfh/jo6JVpQ N40k0gTCWVUYQH4JalFly6SXbfJnbW0cSJ913MaU924kyhc+M0aTcgXTfUThoXwfuzCO eug3jEJP0/WGDuo+CCI+H/4Bgr2FJ76wMR3+Bpz69chgmlh1vQRLaygc9Lgz1O81TYUv 6++Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=BUa2sGYQ5pE+WXHCee7JUKHRZrR8FCZKjHLXT2C4YTU=; b=hmHZxN+VycgNKDOKlp1c6yM+mXMPMYiA1t/Go+EwTHVk2o9s30vLAxvcGmSwAYaJwk nDXmXiT1NpZXp5LdAifpvyBNRAirbqkgdo7D3HZPkIAb/qHmfW+ETI5IXIPICjOvtXc1 pzXqi8vdxmmADwldkCOH8TWIvnDV4W9KJX+A/mHX4Jpeqmhi8KYPNcuTzLtTIUFwEofC mvty4cQOJSSyj4V2I6p4dF8LmydQpjuxNajCJgXKKRwBJUihs6UqhOCxsii/PSY8kkir UWIKw28ioTieHBu9Qv7R2KquQzhQSPrFNyslYVfKkoI/V6FCxvfBL7di7ld0D0M7rKiD V3bA== X-Gm-Message-State: APjAAAVTBWyBXGHJYiVCgXHcrqJL0Ol5/ZTnQMtGbiP5vG9jPxUFLAs2 zh8hrhm311XVJ0q5LUvWj1B5gFbHI/aP X-Google-Smtp-Source: APXvYqzv5gI0ULsp/bBoCIzGtI5FXE2rHyEEXw9SBUXCCbdcMUiowT1j+VNUPjxnoYXdFnUN4vSJAglqtm4Y X-Received: by 2002:a63:1322:: with SMTP id i34mr35499353pgl.163.1579113814157; Wed, 15 Jan 2020 10:43:34 -0800 (PST) Date: Wed, 15 Jan 2020 10:43:04 -0800 In-Reply-To: <20200115184308.162644-1-brianvv@google.com> Message-Id: <20200115184308.162644-6-brianvv@google.com> Mime-Version: 1.0 References: <20200115184308.162644-1-brianvv@google.com> X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog Subject: [PATCH v5 bpf-next 5/9] bpf: add batch ops to all htab bpf map From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Yonghong Song htab can't use generic batch support due some problematic behaviours inherent to the data structre, i.e. while iterating the bpf map a concurrent program might delete the next entry that batch was about to use, in that case there's no easy solution to retrieve the next entry, the issue has been discussed multiple times (see [1] and [2]). The only way hmap can be traversed without the problem previously exposed is by making sure that the map is traversing entire buckets. This commit implements those strict requirements for hmap, the implementation follows the same interaction that generic support with some exceptions: - If keys/values buffer are not big enough to traverse a bucket, ENOSPC will be returned. - out_batch contains the value of the next bucket in the iteration, not the next key, but this is transparent for the user since the user should never use out_batch for other than bpf batch syscalls. This commits implements BPF_MAP_LOOKUP_BATCH and adds support for new command BPF_MAP_LOOKUP_AND_DELETE_BATCH. Note that for update/delete batch ops it is possible to use the generic implementations. [1] https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/ [2] https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/ Signed-off-by: Yonghong Song Signed-off-by: Brian Vazquez --- include/linux/bpf.h | 3 + include/uapi/linux/bpf.h | 1 + kernel/bpf/hashtab.c | 264 +++++++++++++++++++++++++++++++++++++++ kernel/bpf/syscall.c | 9 +- 4 files changed, 276 insertions(+), 1 deletion(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 05466ad6cf1c5..3517e32149a4f 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -46,6 +46,9 @@ struct bpf_map_ops { void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key); int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr, union bpf_attr __user *uattr); + int (*map_lookup_and_delete_batch)(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr); int (*map_update_batch)(struct bpf_map *map, const union bpf_attr *attr, union bpf_attr __user *uattr); int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index e8df9ca680e0c..9536729a03d57 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -108,6 +108,7 @@ enum bpf_cmd { BPF_MAP_FREEZE, BPF_BTF_GET_NEXT_ID, BPF_MAP_LOOKUP_BATCH, + BPF_MAP_LOOKUP_AND_DELETE_BATCH, BPF_MAP_UPDATE_BATCH, BPF_MAP_DELETE_BATCH, }; diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 22066a62c8c97..2d182c4ee9d99 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -17,6 +17,16 @@ (BPF_F_NO_PREALLOC | BPF_F_NO_COMMON_LRU | BPF_F_NUMA_NODE | \ BPF_F_ACCESS_MASK | BPF_F_ZERO_SEED) +#define BATCH_OPS(_name) \ + .map_lookup_batch = \ + _name##_map_lookup_batch, \ + .map_lookup_and_delete_batch = \ + _name##_map_lookup_and_delete_batch, \ + .map_update_batch = \ + generic_map_update_batch, \ + .map_delete_batch = \ + generic_map_delete_batch + struct bucket { struct hlist_nulls_head head; raw_spinlock_t lock; @@ -1232,6 +1242,256 @@ static void htab_map_seq_show_elem(struct bpf_map *map, void *key, rcu_read_unlock(); } +static int +__htab_map_lookup_and_delete_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr, + bool do_delete, bool is_lru_map, + bool is_percpu) +{ + struct bpf_htab *htab = container_of(map, struct bpf_htab, map); + u32 bucket_cnt, total, key_size, value_size, roundup_key_size; + void *keys = NULL, *values = NULL, *value, *dst_key, *dst_val; + void __user *uvalues = u64_to_user_ptr(attr->batch.values); + void __user *ukeys = u64_to_user_ptr(attr->batch.keys); + void *ubatch = u64_to_user_ptr(attr->batch.in_batch); + u32 batch, max_count, size, bucket_size; + u64 elem_map_flags, map_flags; + struct hlist_nulls_head *head; + struct hlist_nulls_node *n; + unsigned long flags; + struct htab_elem *l; + struct bucket *b; + int ret = 0; + + elem_map_flags = attr->batch.elem_flags; + if ((elem_map_flags & ~BPF_F_LOCK) || + ((elem_map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map))) + return -EINVAL; + + map_flags = attr->batch.flags; + if (map_flags) + return -EINVAL; + + max_count = attr->batch.count; + if (!max_count) + return 0; + + if (put_user(0, &uattr->batch.count)) + return -EFAULT; + + batch = 0; + if (ubatch && copy_from_user(&batch, ubatch, sizeof(batch))) + return -EFAULT; + + if (batch >= htab->n_buckets) + return -ENOENT; + + key_size = htab->map.key_size; + roundup_key_size = round_up(htab->map.key_size, 8); + value_size = htab->map.value_size; + size = round_up(value_size, 8); + if (is_percpu) + value_size = size * num_possible_cpus(); + total = 0; + /* while experimenting with hash tables with sizes ranging from 10 to + * 1000, it was observed that a bucket can have upto 5 entries. + */ + bucket_size = 5; + +alloc: + /* We cannot do copy_from_user or copy_to_user inside + * the rcu_read_lock. Allocate enough space here. + */ + keys = kvmalloc(key_size * bucket_size, GFP_USER | __GFP_NOWARN); + values = kvmalloc(value_size * bucket_size, GFP_USER | __GFP_NOWARN); + if (!keys || !values) { + ret = -ENOMEM; + goto after_loop; + } + +again: + preempt_disable(); + this_cpu_inc(bpf_prog_active); + rcu_read_lock(); +again_nocopy: + dst_key = keys; + dst_val = values; + b = &htab->buckets[batch]; + head = &b->head; + raw_spin_lock_irqsave(&b->lock, flags); + + bucket_cnt = 0; + hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) + bucket_cnt++; + + if (bucket_cnt > (max_count - total)) { + if (total == 0) + ret = -ENOSPC; + raw_spin_unlock_irqrestore(&b->lock, flags); + rcu_read_unlock(); + this_cpu_dec(bpf_prog_active); + preempt_enable(); + goto after_loop; + } + + if (bucket_cnt > bucket_size) { + bucket_size = bucket_cnt; + raw_spin_unlock_irqrestore(&b->lock, flags); + rcu_read_unlock(); + this_cpu_dec(bpf_prog_active); + preempt_enable(); + kvfree(keys); + kvfree(values); + goto alloc; + } + + hlist_nulls_for_each_entry_safe(l, n, head, hash_node) { + memcpy(dst_key, l->key, key_size); + + if (is_percpu) { + int off = 0, cpu; + void __percpu *pptr; + + pptr = htab_elem_get_ptr(l, map->key_size); + for_each_possible_cpu(cpu) { + bpf_long_memcpy(dst_val + off, + per_cpu_ptr(pptr, cpu), size); + off += size; + } + } else { + value = l->key + roundup_key_size; + if (elem_map_flags & BPF_F_LOCK) + copy_map_value_locked(map, dst_val, value, + true); + else + copy_map_value(map, dst_val, value); + check_and_init_map_lock(map, dst_val); + } + if (do_delete) { + hlist_nulls_del_rcu(&l->hash_node); + if (is_lru_map) + bpf_lru_push_free(&htab->lru, &l->lru_node); + else + free_htab_elem(htab, l); + } + dst_key += key_size; + dst_val += value_size; + } + + raw_spin_unlock_irqrestore(&b->lock, flags); + /* If we are not copying data, we can go to next bucket and avoid + * unlocking the rcu. + */ + if (!bucket_cnt && (batch + 1 < htab->n_buckets)) { + batch++; + goto again_nocopy; + } + + rcu_read_unlock(); + this_cpu_dec(bpf_prog_active); + preempt_enable(); + if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys, + key_size * bucket_cnt) || + copy_to_user(uvalues + total * value_size, values, + value_size * bucket_cnt))) { + ret = -EFAULT; + goto after_loop; + } + + total += bucket_cnt; + batch++; + if (batch >= htab->n_buckets) { + ret = -ENOENT; + goto after_loop; + } + goto again; + +after_loop: + if (ret == -EFAULT) + goto out; + + /* copy # of entries and next batch */ + ubatch = u64_to_user_ptr(attr->batch.out_batch); + if (copy_to_user(ubatch, &batch, sizeof(batch)) || + put_user(total, &uattr->batch.count)) + ret = -EFAULT; + +out: + kvfree(keys); + kvfree(values); + return ret; +} + +static int +htab_percpu_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + return __htab_map_lookup_and_delete_batch(map, attr, uattr, false, + false, true); +} + +static int +htab_percpu_map_lookup_and_delete_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + return __htab_map_lookup_and_delete_batch(map, attr, uattr, true, + false, true); +} + +static int +htab_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + return __htab_map_lookup_and_delete_batch(map, attr, uattr, false, + false, false); +} + +static int +htab_map_lookup_and_delete_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + return __htab_map_lookup_and_delete_batch(map, attr, uattr, true, + false, false); +} + +static int +htab_lru_percpu_map_lookup_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + return __htab_map_lookup_and_delete_batch(map, attr, uattr, false, + true, true); +} + +static int +htab_lru_percpu_map_lookup_and_delete_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + return __htab_map_lookup_and_delete_batch(map, attr, uattr, true, + true, true); +} + +static int +htab_lru_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + return __htab_map_lookup_and_delete_batch(map, attr, uattr, false, + true, false); +} + +static int +htab_lru_map_lookup_and_delete_batch(struct bpf_map *map, + const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + return __htab_map_lookup_and_delete_batch(map, attr, uattr, true, + true, false); +} + const struct bpf_map_ops htab_map_ops = { .map_alloc_check = htab_map_alloc_check, .map_alloc = htab_map_alloc, @@ -1242,6 +1502,7 @@ const struct bpf_map_ops htab_map_ops = { .map_delete_elem = htab_map_delete_elem, .map_gen_lookup = htab_map_gen_lookup, .map_seq_show_elem = htab_map_seq_show_elem, + BATCH_OPS(htab), }; const struct bpf_map_ops htab_lru_map_ops = { @@ -1255,6 +1516,7 @@ const struct bpf_map_ops htab_lru_map_ops = { .map_delete_elem = htab_lru_map_delete_elem, .map_gen_lookup = htab_lru_map_gen_lookup, .map_seq_show_elem = htab_map_seq_show_elem, + BATCH_OPS(htab_lru), }; /* Called from eBPF program */ @@ -1368,6 +1630,7 @@ const struct bpf_map_ops htab_percpu_map_ops = { .map_update_elem = htab_percpu_map_update_elem, .map_delete_elem = htab_map_delete_elem, .map_seq_show_elem = htab_percpu_map_seq_show_elem, + BATCH_OPS(htab_percpu), }; const struct bpf_map_ops htab_lru_percpu_map_ops = { @@ -1379,6 +1642,7 @@ const struct bpf_map_ops htab_lru_percpu_map_ops = { .map_update_elem = htab_lru_percpu_map_update_elem, .map_delete_elem = htab_lru_map_delete_elem, .map_seq_show_elem = htab_percpu_map_seq_show_elem, + BATCH_OPS(htab_lru_percpu), }; static int fd_htab_map_alloc_check(union bpf_attr *attr) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index ce8244d1ba99c..0d94d361379c3 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -3310,7 +3310,8 @@ static int bpf_map_do_batch(const union bpf_attr *attr, if (IS_ERR(map)) return PTR_ERR(map); - if (cmd == BPF_MAP_LOOKUP_BATCH && + if ((cmd == BPF_MAP_LOOKUP_BATCH || + cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH) && !(map_get_sys_perms(map, f) & FMODE_CAN_READ)) { err = -EPERM; goto err_put; @@ -3324,6 +3325,8 @@ static int bpf_map_do_batch(const union bpf_attr *attr, if (cmd == BPF_MAP_LOOKUP_BATCH) BPF_DO_BATCH(map->ops->map_lookup_batch); + else if (cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH) + BPF_DO_BATCH(map->ops->map_lookup_and_delete_batch); else if (cmd == BPF_MAP_UPDATE_BATCH) BPF_DO_BATCH(map->ops->map_update_batch); else @@ -3434,6 +3437,10 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz case BPF_MAP_LOOKUP_BATCH: err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH); break; + case BPF_MAP_LOOKUP_AND_DELETE_BATCH: + err = bpf_map_do_batch(&attr, uattr, + BPF_MAP_LOOKUP_AND_DELETE_BATCH); + break; case BPF_MAP_UPDATE_BATCH: err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH); break; From patchwork Wed Jan 15 18:43:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1223792 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=uRu8Ql4O; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47ybnv6Q1Sz9sRR for ; Thu, 16 Jan 2020 05:44:03 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729160AbgAOSoC (ORCPT ); Wed, 15 Jan 2020 13:44:02 -0500 Received: from mail-pl1-f202.google.com ([209.85.214.202]:33092 "EHLO mail-pl1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729388AbgAOSnh (ORCPT ); Wed, 15 Jan 2020 13:43:37 -0500 Received: by mail-pl1-f202.google.com with SMTP id bd7so7340660plb.0 for ; Wed, 15 Jan 2020 10:43:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=xQls5+iJi5Kqz7m3pR6x7QjrYUU6eZohn0NALbSGdKM=; b=uRu8Ql4OQe2tRaLyN4BrE2DzO8IeVWLi2wpeas1c8W1cbwT50VQ4hrVy7LBKcfIDne kbAGW11znsVoK9quuUOlLQ0elOv7VjBCiXAfr6BresyWbkaGuazrWHb+leZly2Ph3DZI O6mDiFPggpkCo+c1kyW6hySnLvFvMCP2sK1liTK0B1Krkycm4prBumLGlVIsfTkVLxZF AbD6D2M4jpR2guEvJrYljDLBvS540vq03Wd32exylK90h+AOlHeOc1a+Q1GaRIugverK Mg5oSJUGhqrWgHoCvlIIB7seQwLdKPFBPRRniyH3BZj0OHNvS1z2Xdjqn8rE6neaiFK1 qVbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=xQls5+iJi5Kqz7m3pR6x7QjrYUU6eZohn0NALbSGdKM=; b=n0T001nqBDfcs4eNGv/yF1Vk16P0QWWACT8C8UW/Lwfn5F5lzN/shfnotM+6WZYCoO qF6d7pUWUnfyk7shIv/66NTBBpTohohKBbou0o2TdWOuI+1MN4ZZl0xtVGXnPERxrhtc bBH7mmB2uGFLFhohqoxeQOnUSAKYE4XWGBhJ0gvr2Gx9oX/gLx7zGRcss0bgqaWF1LbI zEq2ufjthgHniNBnhuhF0lKjAaSnfuSxROpWufYLuCNptGGe4lCLnvkONT9U87tjRuHz xg5mZwKulOdp99oAfT8MhiV6WX/e/+BSmfoWTDkwaD8A1MSJQfIEbpZ12RSaGZgYABO5 ywgw== X-Gm-Message-State: APjAAAUQOpIURBswliGow7VUX4SQ7B8Bq3FOFU1nYyKNJDeQesoDNKax nWIEOg+/nHMglk0u9tYc54yQBnc72+sm X-Google-Smtp-Source: APXvYqw5Ows/8hwQ5OpgZH3vnFhIClHIpjJvEHfV11l+uWhlycMBwr24JyLQtkvI1Jahpm1ssHhjFsYA51Ri X-Received: by 2002:a63:a34b:: with SMTP id v11mr33495694pgn.229.1579113816781; Wed, 15 Jan 2020 10:43:36 -0800 (PST) Date: Wed, 15 Jan 2020 10:43:05 -0800 In-Reply-To: <20200115184308.162644-1-brianvv@google.com> Message-Id: <20200115184308.162644-7-brianvv@google.com> Mime-Version: 1.0 References: <20200115184308.162644-1-brianvv@google.com> X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog Subject: [PATCH v5 bpf-next 6/9] tools/bpf: sync uapi header bpf.h From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Yonghong Song sync uapi header include/uapi/linux/bpf.h to tools/include/uapi/linux/bpf.h Signed-off-by: Yonghong Song --- tools/include/uapi/linux/bpf.h | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 52966e758fe59..9536729a03d57 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -107,6 +107,10 @@ enum bpf_cmd { BPF_MAP_LOOKUP_AND_DELETE_ELEM, BPF_MAP_FREEZE, BPF_BTF_GET_NEXT_ID, + BPF_MAP_LOOKUP_BATCH, + BPF_MAP_LOOKUP_AND_DELETE_BATCH, + BPF_MAP_UPDATE_BATCH, + BPF_MAP_DELETE_BATCH, }; enum bpf_map_type { @@ -420,6 +424,23 @@ union bpf_attr { __u64 flags; }; + struct { /* struct used by BPF_MAP_*_BATCH commands */ + __aligned_u64 in_batch; /* start batch, + * NULL to start from beginning + */ + __aligned_u64 out_batch; /* output: next start batch */ + __aligned_u64 keys; + __aligned_u64 values; + __u32 count; /* input/output: + * input: # of key/value + * elements + * output: # of filled elements + */ + __u32 map_fd; + __u64 elem_flags; + __u64 flags; + } batch; + struct { /* anonymous struct used by BPF_PROG_LOAD command */ __u32 prog_type; /* one of enum bpf_prog_type */ __u32 insn_cnt; From patchwork Wed Jan 15 18:43:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1223786 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=BKVmoDfc; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47ybnV1bsVz9sRG for ; Thu, 16 Jan 2020 05:43:42 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729416AbgAOSnl (ORCPT ); Wed, 15 Jan 2020 13:43:41 -0500 Received: from mail-pf1-f202.google.com ([209.85.210.202]:51056 "EHLO mail-pf1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729409AbgAOSnk (ORCPT ); Wed, 15 Jan 2020 13:43:40 -0500 Received: by mail-pf1-f202.google.com with SMTP id g69so11440358pfb.17 for ; Wed, 15 Jan 2020 10:43:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=RCyrkp6zT5EPgtnS5mYgbeYBhIOvjRmdaqV1JP6GZIM=; b=BKVmoDfcMdak267QSTOLD7Kr7mys7DR6FMqoEiJoy7U2bcelFMYINew1NNR4zVak31 1i6qmI3np3uqMC+7PEJQ6MzpuUDdRc35ffj1NQIp5imJ8flKY6SahNdy07V/tr+Ohsd8 gwd1pTlMOero2dI3lesFWGgb+JTkeQLlYBqB4sAAeNRdjur4PmVjFzd/Cs4w/h9SwnJx x0lJR3yiR2n+MOzKWMnuTsmWiZvDGjW7QvUUQkJ2xs8TPH61M5qe9bE2Ms/pAvsSLfVR XZRsFbf6YL02n0R6TYFFBePCYEGjBFV7BL2O6wSuvMR4eC8PJc3+Q3eRxORDxj7Ojuoc j9aA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=RCyrkp6zT5EPgtnS5mYgbeYBhIOvjRmdaqV1JP6GZIM=; b=FS7AVP+dzgV2/E/juaxnwwx0EbGsWjqAAyuP3bJu4Qr6TUccZElXbendiGojBoz/06 f3rNIO7DUQ71Kmc+vz+rbhas5sZvUxOHbWHSJWRlr6mBGw3IJkqqR0ROHkNh8B1IJCok /QJlMXD/CUeFwy/GYCXJEeeCdwV/j7wvRiqgZZ+0razavpRmHUe5Y7mhse2pdzKO1PZB gsAnSpYM8jH1qIPqi39WflJCb5h2MWLShjF1771+t9Pvl3B5fv65gk9GtQX61Kg+lryg BdhHBeo0GRcwhmNjcCGuM854OyZXZthjG1ziyn5GHFiAlcTayKLOMcROIWKTJkLIk+Z9 vmFQ== X-Gm-Message-State: APjAAAUZZZWS7nUrJTEFdUkSfLlkLA3Mqn7ayD0WvgWOZ0YtyG0On8QK /nJaqP8zndwwd0qIWDKDPpxXfiyHN5ca X-Google-Smtp-Source: APXvYqxBGB2djjbxfuFloBwlEqrpQuPjeGkUt4dRVDs6EOS8BdEBorETWbyKwbpHtdwR9cBDpJQCgtgjN2lY X-Received: by 2002:a63:d20a:: with SMTP id a10mr33681152pgg.273.1579113819263; Wed, 15 Jan 2020 10:43:39 -0800 (PST) Date: Wed, 15 Jan 2020 10:43:06 -0800 In-Reply-To: <20200115184308.162644-1-brianvv@google.com> Message-Id: <20200115184308.162644-8-brianvv@google.com> Mime-Version: 1.0 References: <20200115184308.162644-1-brianvv@google.com> X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog Subject: [PATCH v5 bpf-next 7/9] libbpf: add libbpf support to batch ops From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Yonghong Song Added four libbpf API functions to support map batch operations: . int bpf_map_delete_batch( ... ) . int bpf_map_lookup_batch( ... ) . int bpf_map_lookup_and_delete_batch( ... ) . int bpf_map_update_batch( ... ) Signed-off-by: Yonghong Song --- tools/lib/bpf/bpf.c | 58 ++++++++++++++++++++++++++++++++++++++++ tools/lib/bpf/bpf.h | 22 +++++++++++++++ tools/lib/bpf/libbpf.map | 4 +++ 3 files changed, 84 insertions(+) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 500afe478e94a..317727d612149 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -452,6 +452,64 @@ int bpf_map_freeze(int fd) return sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr)); } +static int bpf_map_batch_common(int cmd, int fd, void *in_batch, + void *out_batch, void *keys, void *values, + __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + union bpf_attr attr = {}; + int ret; + + if (!OPTS_VALID(opts, bpf_map_batch_opts)) + return -EINVAL; + + memset(&attr, 0, sizeof(attr)); + attr.batch.map_fd = fd; + attr.batch.in_batch = ptr_to_u64(in_batch); + attr.batch.out_batch = ptr_to_u64(out_batch); + attr.batch.keys = ptr_to_u64(keys); + attr.batch.values = ptr_to_u64(values); + attr.batch.count = *count; + attr.batch.elem_flags = OPTS_GET(opts, elem_flags, 0); + attr.batch.flags = OPTS_GET(opts, flags, 0); + + ret = sys_bpf(cmd, &attr, sizeof(attr)); + *count = attr.batch.count; + + return ret; +} + +int bpf_map_delete_batch(int fd, void *keys, __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + return bpf_map_batch_common(BPF_MAP_DELETE_BATCH, fd, NULL, + NULL, keys, NULL, count, opts); +} + +int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch, void *keys, + void *values, __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + return bpf_map_batch_common(BPF_MAP_LOOKUP_BATCH, fd, in_batch, + out_batch, keys, values, count, opts); +} + +int bpf_map_lookup_and_delete_batch(int fd, void *in_batch, void *out_batch, + void *keys, void *values, __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + return bpf_map_batch_common(BPF_MAP_LOOKUP_AND_DELETE_BATCH, + fd, in_batch, out_batch, keys, values, + count, opts); +} + +int bpf_map_update_batch(int fd, void *keys, void *values, __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + return bpf_map_batch_common(BPF_MAP_UPDATE_BATCH, fd, NULL, NULL, + keys, values, count, opts); +} + int bpf_obj_pin(int fd, const char *pathname) { union bpf_attr attr; diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 56341d117e5ba..b976e77316cca 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -127,6 +127,28 @@ LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key, LIBBPF_API int bpf_map_delete_elem(int fd, const void *key); LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key); LIBBPF_API int bpf_map_freeze(int fd); + +struct bpf_map_batch_opts { + size_t sz; /* size of this struct for forward/backward compatibility */ + __u64 elem_flags; + __u64 flags; +}; +#define bpf_map_batch_opts__last_field flags + +LIBBPF_API int bpf_map_delete_batch(int fd, void *keys, + __u32 *count, + const struct bpf_map_batch_opts *opts); +LIBBPF_API int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch, + void *keys, void *values, __u32 *count, + const struct bpf_map_batch_opts *opts); +LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, void *in_batch, + void *out_batch, void *keys, + void *values, __u32 *count, + const struct bpf_map_batch_opts *opts); +LIBBPF_API int bpf_map_update_batch(int fd, void *keys, void *values, + __u32 *count, + const struct bpf_map_batch_opts *opts); + LIBBPF_API int bpf_obj_pin(int fd, const char *pathname); LIBBPF_API int bpf_obj_get(const char *pathname); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index a19f04e6e3d96..1902a0fc6afcc 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -214,6 +214,10 @@ LIBBPF_0.0.7 { btf_dump__emit_type_decl; bpf_link__disconnect; bpf_map__attach_struct_ops; + bpf_map_delete_batch; + bpf_map_lookup_and_delete_batch; + bpf_map_lookup_batch; + bpf_map_update_batch; bpf_object__find_program_by_name; bpf_object__attach_skeleton; bpf_object__destroy_skeleton; From patchwork Wed Jan 15 18:43:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1223788 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=XqtMHQB+; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47ybnY2H8Gz9sNx for ; Thu, 16 Jan 2020 05:43:45 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729456AbgAOSno (ORCPT ); Wed, 15 Jan 2020 13:43:44 -0500 Received: from mail-yw1-f74.google.com ([209.85.161.74]:40185 "EHLO mail-yw1-f74.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728928AbgAOSnn (ORCPT ); Wed, 15 Jan 2020 13:43:43 -0500 Received: by mail-yw1-f74.google.com with SMTP id v126so20252890ywf.7 for ; Wed, 15 Jan 2020 10:43:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=JYxxPDoIoQLQ7jSjLTKBV8X5KqxpPi921MyZWpnHjqU=; b=XqtMHQB+U0yAw+XVUIxEcaRCAhli3s1ROc/oIWs8LbfkiKTc+0ENuEqOWDku3k/6g2 gxZi+1dKUlCsBMJaK0AKRWk803ini5/XilajvN51/9VAXyLDHmJ1RAvfVd3np5vJWmF2 8VKUVQ4Wwo6ZxclMMmWFCth9emobgxC339Mgo1hLFDpKZ2LQ9EdOYfRsr3/CHadNkBZk 2xNXMntXDCgqvHMbJWSGVwc/b5GySPMgvUI39KMxARm3SC+MsZvroqfBEin0oZz03FiB L5h4poPsQd886MpPGK8wjQvcM5KuI3GRrUIdkZQdDTF+FgXxiIdbo270WL7Q0Zp+Dhyc TKtw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=JYxxPDoIoQLQ7jSjLTKBV8X5KqxpPi921MyZWpnHjqU=; b=UPi/khaIFpe1O+UuhTKbiUOQB2IqYB7nLTnnxDFN/Q76KBf4KCmPr1nxr5ke7B79L3 9d5XqmbEsKWXIkzF/G+bQCxlbIKv71xU0Ml1si0CS649SubFeY7FuOTg7NzG73v6p5Ec wqA+lqyFqHbbD4XOi46PC3f2hgfzfzapYe9tfQ8xgayCuGGh5sCq42Pss6gCiAOU7Iez utIfNetW7ycgsqT0snaqGnCwRSPOCPOZMWFYDcWCBgwAyy6DkTY6BUTMIMrfx9wF9DPg Hc/p0BDqA8bW4Fu7Ivr6nua8ntj4F4ri7A1I2UHzvgs7SUN2ZTpJrdWd+oohlxFvIGc5 Q3dw== X-Gm-Message-State: APjAAAXp03A29x4alyqGym37blM+S1vf9JKLLhPSYS1qqGgQmFNdGOCN aEk39C+sGH3YA8zX935gvC9na1LFtKP8 X-Google-Smtp-Source: APXvYqxex6z0GwTJOrUDLiV/7gvzf8X06XuXIyOqPdxxp1iFy4sLec/60idqj4kurlP3GpIUPmEW/77OphKn X-Received: by 2002:a81:9ac8:: with SMTP id r191mr22271033ywg.20.1579113821826; Wed, 15 Jan 2020 10:43:41 -0800 (PST) Date: Wed, 15 Jan 2020 10:43:07 -0800 In-Reply-To: <20200115184308.162644-1-brianvv@google.com> Message-Id: <20200115184308.162644-9-brianvv@google.com> Mime-Version: 1.0 References: <20200115184308.162644-1-brianvv@google.com> X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog Subject: [PATCH v5 bpf-next 8/9] selftests/bpf: add batch ops testing for htab and htab_percpu map From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Yonghong Song Tested bpf_map_lookup_batch(), bpf_map_lookup_and_delete_batch(), bpf_map_update_batch(), and bpf_map_delete_batch() functionality. $ ./test_maps ... test_htab_map_batch_ops:PASS test_htab_percpu_map_batch_ops:PASS ... Signed-off-by: Yonghong Song Signed-off-by: Brian Vazquez --- .../bpf/map_tests/htab_map_batch_ops.c | 283 ++++++++++++++++++ 1 file changed, 283 insertions(+) create mode 100644 tools/testing/selftests/bpf/map_tests/htab_map_batch_ops.c diff --git a/tools/testing/selftests/bpf/map_tests/htab_map_batch_ops.c b/tools/testing/selftests/bpf/map_tests/htab_map_batch_ops.c new file mode 100644 index 0000000000000..976bf415fbdd9 --- /dev/null +++ b/tools/testing/selftests/bpf/map_tests/htab_map_batch_ops.c @@ -0,0 +1,283 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2019 Facebook */ +#include +#include +#include + +#include +#include + +#include +#include + +static void map_batch_update(int map_fd, __u32 max_entries, int *keys, + void *values, bool is_pcpu) +{ + typedef BPF_DECLARE_PERCPU(int, value); + value *v = NULL; + int i, j, err; + DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts, + .elem_flags = 0, + .flags = 0, + ); + + if (is_pcpu) + v = (value *)values; + + for (i = 0; i < max_entries; i++) { + keys[i] = i + 1; + if (is_pcpu) + for (j = 0; j < bpf_num_possible_cpus(); j++) + bpf_percpu(v[i], j) = i + 2 + j; + else + ((int *)values)[i] = i + 2; + } + + err = bpf_map_update_batch(map_fd, keys, values, &max_entries, &opts); + CHECK(err, "bpf_map_update_batch()", "error:%s\n", strerror(errno)); +} + +static void map_batch_verify(int *visited, __u32 max_entries, + int *keys, void *values, bool is_pcpu) +{ + typedef BPF_DECLARE_PERCPU(int, value); + value *v = NULL; + int i, j; + + if (is_pcpu) + v = (value *)values; + + memset(visited, 0, max_entries * sizeof(*visited)); + for (i = 0; i < max_entries; i++) { + + if (is_pcpu) { + for (j = 0; j < bpf_num_possible_cpus(); j++) { + CHECK(keys[i] + 1 + j != bpf_percpu(v[i], j), + "key/value checking", + "error: i %d j %d key %d value %d\n", + i, j, keys[i], bpf_percpu(v[i], j)); + } + } else { + CHECK(keys[i] + 1 != ((int *)values)[i], + "key/value checking", + "error: i %d key %d value %d\n", i, keys[i], + ((int *)values)[i]); + } + + visited[i] = 1; + + } + for (i = 0; i < max_entries; i++) { + CHECK(visited[i] != 1, "visited checking", + "error: keys array at index %d missing\n", i); + } +} + +void __test_map_lookup_and_delete_batch(bool is_pcpu) +{ + __u32 batch, count, total, total_success; + typedef BPF_DECLARE_PERCPU(int, value); + int map_fd, *keys, *visited, key; + const __u32 max_entries = 10; + value pcpu_values[max_entries]; + int err, step, value_size; + bool nospace_err; + void *values; + struct bpf_create_map_attr xattr = { + .name = "hash_map", + .map_type = is_pcpu ? BPF_MAP_TYPE_PERCPU_HASH : + BPF_MAP_TYPE_HASH, + .key_size = sizeof(int), + .value_size = sizeof(int), + }; + DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts, + .elem_flags = 0, + .flags = 0, + ); + + xattr.max_entries = max_entries; + map_fd = bpf_create_map_xattr(&xattr); + CHECK(map_fd == -1, + "bpf_create_map_xattr()", "error:%s\n", strerror(errno)); + + value_size = is_pcpu ? sizeof(value) : sizeof(int); + keys = malloc(max_entries * sizeof(int)); + if (is_pcpu) + values = pcpu_values; + else + values = malloc(max_entries * sizeof(int)); + visited = malloc(max_entries * sizeof(int)); + CHECK(!keys || !values || !visited, "malloc()", + "error:%s\n", strerror(errno)); + + /* test 1: lookup/delete an empty hash table, -ENOENT */ + count = max_entries; + err = bpf_map_lookup_and_delete_batch(map_fd, NULL, &batch, keys, + values, &count, &opts); + CHECK((err && errno != ENOENT), "empty map", + "error: %s\n", strerror(errno)); + + /* populate elements to the map */ + map_batch_update(map_fd, max_entries, keys, values, is_pcpu); + + /* test 2: lookup/delete with count = 0, success */ + count = 0; + err = bpf_map_lookup_and_delete_batch(map_fd, NULL, &batch, keys, + values, &count, &opts); + CHECK(err, "count = 0", "error: %s\n", strerror(errno)); + + /* test 3: lookup/delete with count = max_entries, success */ + memset(keys, 0, max_entries * sizeof(*keys)); + memset(values, 0, max_entries * value_size); + count = max_entries; + err = bpf_map_lookup_and_delete_batch(map_fd, NULL, &batch, keys, + values, &count, &opts); + CHECK((err && errno != ENOENT), "count = max_entries", + "error: %s\n", strerror(errno)); + CHECK(count != max_entries, "count = max_entries", + "count = %u, max_entries = %u\n", count, max_entries); + map_batch_verify(visited, max_entries, keys, values, is_pcpu); + + /* bpf_map_get_next_key() should return -ENOENT for an empty map. */ + err = bpf_map_get_next_key(map_fd, NULL, &key); + CHECK(!err, "bpf_map_get_next_key()", "error: %s\n", strerror(errno)); + + /* test 4: lookup/delete in a loop with various steps. */ + total_success = 0; + for (step = 1; step < max_entries; step++) { + map_batch_update(map_fd, max_entries, keys, values, is_pcpu); + memset(keys, 0, max_entries * sizeof(*keys)); + memset(values, 0, max_entries * value_size); + total = 0; + /* iteratively lookup/delete elements with 'step' + * elements each + */ + count = step; + nospace_err = false; + while (true) { + err = bpf_map_lookup_batch(map_fd, + total ? &batch : NULL, + &batch, keys + total, + values + + total * value_size, + &count, &opts); + /* It is possible that we are failing due to buffer size + * not big enough. In such cases, let us just exit and + * go with large steps. Not that a buffer size with + * max_entries should always work. + */ + if (err && errno == ENOSPC) { + nospace_err = true; + break; + } + + CHECK((err && errno != ENOENT), "lookup with steps", + "error: %s\n", strerror(errno)); + + total += count; + if (err) + break; + + } + if (nospace_err == true) + continue; + + CHECK(total != max_entries, "lookup with steps", + "total = %u, max_entries = %u\n", total, max_entries); + map_batch_verify(visited, max_entries, keys, values, is_pcpu); + + total = 0; + count = step; + while (total < max_entries) { + if (max_entries - total < step) + count = max_entries - total; + err = bpf_map_delete_batch(map_fd, + keys + total, + &count, &opts); + CHECK((err && errno != ENOENT), "delete batch", + "error: %s\n", strerror(errno)); + total += count; + if (err) + break; + } + CHECK(total != max_entries, "delete with steps", + "total = %u, max_entries = %u\n", total, max_entries); + + /* check map is empty, errono == ENOENT */ + err = bpf_map_get_next_key(map_fd, NULL, &key); + CHECK(!err || errno != ENOENT, "bpf_map_get_next_key()", + "error: %s\n", strerror(errno)); + + /* iteratively lookup/delete elements with 'step' + * elements each + */ + map_batch_update(map_fd, max_entries, keys, values, is_pcpu); + memset(keys, 0, max_entries * sizeof(*keys)); + memset(values, 0, max_entries * value_size); + total = 0; + count = step; + nospace_err = false; + while (true) { + err = bpf_map_lookup_and_delete_batch(map_fd, + total ? &batch : NULL, + &batch, keys + total, + values + + total * value_size, + &count, &opts); + /* It is possible that we are failing due to buffer size + * not big enough. In such cases, let us just exit and + * go with large steps. Not that a buffer size with + * max_entries should always work. + */ + if (err && errno == ENOSPC) { + nospace_err = true; + break; + } + + CHECK((err && errno != ENOENT), "lookup with steps", + "error: %s\n", strerror(errno)); + + total += count; + if (err) + break; + } + + if (nospace_err == true) + continue; + + CHECK(total != max_entries, "lookup/delete with steps", + "total = %u, max_entries = %u\n", total, max_entries); + + map_batch_verify(visited, max_entries, keys, values, is_pcpu); + err = bpf_map_get_next_key(map_fd, NULL, &key); + CHECK(!err, "bpf_map_get_next_key()", "error: %s\n", + strerror(errno)); + + total_success++; + } + + CHECK(total_success == 0, "check total_success", + "unexpected failure\n"); + free(keys); + free(visited); + if (!is_pcpu) + free(values); +} + +void htab_map_batch_ops(void) +{ + __test_map_lookup_and_delete_batch(false); + printf("test_%s:PASS\n", __func__); +} + +void htab_percpu_map_batch_ops(void) +{ + __test_map_lookup_and_delete_batch(true); + printf("test_%s:PASS\n", __func__); +} + +void test_htab_map_batch_ops(void) +{ + htab_map_batch_ops(); + htab_percpu_map_batch_ops(); +} From patchwork Wed Jan 15 18:43:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1223789 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=TqC5xz12; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47ybnc1ywXz9sNx for ; Thu, 16 Jan 2020 05:43:48 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729485AbgAOSnr (ORCPT ); Wed, 15 Jan 2020 13:43:47 -0500 Received: from mail-pg1-f201.google.com ([209.85.215.201]:43092 "EHLO mail-pg1-f201.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729457AbgAOSnp (ORCPT ); Wed, 15 Jan 2020 13:43:45 -0500 Received: by mail-pg1-f201.google.com with SMTP id d9so10801755pgd.10 for ; Wed, 15 Jan 2020 10:43:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=CjoqkXpvMMXTZKEQf/w0i8b/juoj1r/E3J2QNyeJ/9M=; b=TqC5xz12UjURima8lfp1XbZtXfz5BlbKfHLb2rtwpYfKXcWnW1o/E0/a79pp6vPt86 bSy4ScDBNz2cQpHVG8SM55ZnATHhNBSHfaZe/x8JQM5OgMTOKq/iuNqAOuPi0TWhhQNr rN/EqE86buKkiMeX/zrNcG8h9rjJt+XInft1rm5tA4zfYaVQpmiBEi3VFxfpETEiS3Rr SIenzw0Yd04dIXL7p8WgcNWZRPe+dBVtBCa8pcSXF6WnDqDeEi44xszrKW4NT9zkySmN Zaba+4T+olDYco+0wuUZcnDi1U1/VO05mBXM66N609TMTnFcACohqtJQz53c+3xau9L1 e99Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=CjoqkXpvMMXTZKEQf/w0i8b/juoj1r/E3J2QNyeJ/9M=; b=BWNM3xtoP4UOFC6HOjbCQ1Vs8Z4IOwt2S34MWC2og0zITdvSxm8AeqWqpnlMe8o3Kd /oyUbDnIO97lDHYyZypiiH1p57iCF7/j9em/aqgXe+AaKH1K+BNKQfEPwh6QmANC+9Y4 C73ZX9XMkVw0FbvUrFqrSq//8F8mAcukFef17AyfWA3zLx5+DQp3tgXDtWWNTB4vc6N/ W+ThnHzOtYpt1L9dK9ULsVFIBgPmfMwLeAuA+PSUQp+sOAns/KUwAB8kDyiXjFm1LESk Yrtmwka3YpRQhqebzJvQfz9M4V7PpJAK7PrCQyRMyA3TxgGIgxwMOvKGoeXsijThBwm1 WR2A== X-Gm-Message-State: APjAAAUCOz2FSuDH6kpsUxaWXnyW5wEHlO3tBDO9byoWbPpq8dvfUMqk RbiZY+H5jYBJbZohpHjTpy6c4Wlmnfhs X-Google-Smtp-Source: APXvYqycDVWqkbUZr8RQhY8T/gDGCqw3Zs5lR9QZEKkr7AtGovXy64qQcceBV2uWlcb4hOFSBkfp3MY1nHtd X-Received: by 2002:a63:181:: with SMTP id 123mr33985878pgb.36.1579113824307; Wed, 15 Jan 2020 10:43:44 -0800 (PST) Date: Wed, 15 Jan 2020 10:43:08 -0800 In-Reply-To: <20200115184308.162644-1-brianvv@google.com> Message-Id: <20200115184308.162644-10-brianvv@google.com> Mime-Version: 1.0 References: <20200115184308.162644-1-brianvv@google.com> X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog Subject: [PATCH v5 bpf-next 9/9] selftests/bpf: add batch ops testing to array bpf map From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Tested bpf_map_lookup_batch() and bpf_map_update_batch() functionality. $ ./test_maps ... test_array_map_batch_ops:PASS ... Signed-off-by: Brian Vazquez Signed-off-by: Yonghong Song --- .../bpf/map_tests/array_map_batch_ops.c | 129 ++++++++++++++++++ 1 file changed, 129 insertions(+) create mode 100644 tools/testing/selftests/bpf/map_tests/array_map_batch_ops.c diff --git a/tools/testing/selftests/bpf/map_tests/array_map_batch_ops.c b/tools/testing/selftests/bpf/map_tests/array_map_batch_ops.c new file mode 100644 index 0000000000000..f0a64d8ac59a5 --- /dev/null +++ b/tools/testing/selftests/bpf/map_tests/array_map_batch_ops.c @@ -0,0 +1,129 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include + +#include +#include + +#include + +static void map_batch_update(int map_fd, __u32 max_entries, int *keys, + int *values) +{ + int i, err; + DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts, + .elem_flags = 0, + .flags = 0, + ); + + for (i = 0; i < max_entries; i++) { + keys[i] = i; + values[i] = i + 1; + } + + err = bpf_map_update_batch(map_fd, keys, values, &max_entries, &opts); + CHECK(err, "bpf_map_update_batch()", "error:%s\n", strerror(errno)); +} + +static void map_batch_verify(int *visited, __u32 max_entries, + int *keys, int *values) +{ + int i; + + memset(visited, 0, max_entries * sizeof(*visited)); + for (i = 0; i < max_entries; i++) { + CHECK(keys[i] + 1 != values[i], "key/value checking", + "error: i %d key %d value %d\n", i, keys[i], values[i]); + visited[i] = 1; + } + for (i = 0; i < max_entries; i++) { + CHECK(visited[i] != 1, "visited checking", + "error: keys array at index %d missing\n", i); + } +} + +void test_array_map_batch_ops(void) +{ + struct bpf_create_map_attr xattr = { + .name = "array_map", + .map_type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(int), + .value_size = sizeof(int), + }; + int map_fd, *keys, *values, *visited; + __u32 count, total, total_success; + const __u32 max_entries = 10; + bool nospace_err; + __u64 batch = 0; + int err, step; + DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts, + .elem_flags = 0, + .flags = 0, + ); + + xattr.max_entries = max_entries; + map_fd = bpf_create_map_xattr(&xattr); + CHECK(map_fd == -1, + "bpf_create_map_xattr()", "error:%s\n", strerror(errno)); + + keys = malloc(max_entries * sizeof(int)); + values = malloc(max_entries * sizeof(int)); + visited = malloc(max_entries * sizeof(int)); + CHECK(!keys || !values || !visited, "malloc()", "error:%s\n", + strerror(errno)); + + /* populate elements to the map */ + map_batch_update(map_fd, max_entries, keys, values); + + /* test 1: lookup in a loop with various steps. */ + total_success = 0; + for (step = 1; step < max_entries; step++) { + map_batch_update(map_fd, max_entries, keys, values); + map_batch_verify(visited, max_entries, keys, values); + memset(keys, 0, max_entries * sizeof(*keys)); + memset(values, 0, max_entries * sizeof(*values)); + batch = 0; + total = 0; + /* iteratively lookup/delete elements with 'step' + * elements each. + */ + count = step; + nospace_err = false; + while (true) { + err = bpf_map_lookup_batch(map_fd, + total ? &batch : NULL, &batch, + keys + total, + values + total, + &count, &opts); + + CHECK((err && errno != ENOENT), "lookup with steps", + "error: %s\n", strerror(errno)); + + total += count; + if (err) + break; + + } + + if (nospace_err == true) + continue; + + CHECK(total != max_entries, "lookup with steps", + "total = %u, max_entries = %u\n", total, max_entries); + + map_batch_verify(visited, max_entries, keys, values); + + total_success++; + } + + CHECK(total_success == 0, "check total_success", + "unexpected failure\n"); + + printf("%s:PASS\n", __func__); + + free(keys); + free(values); + free(visited); +}