From patchwork Tue Aug 20 11:47:03 2019
X-Patchwork-Submitter: Toke Høiland-Jørgensen
X-Patchwork-Id: 1150055
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toke Høiland-Jørgensen
To: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov
Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
    Jesper Dangaard Brouer, netdev@vger.kernel.org, bpf@vger.kernel.org,
    Toke Høiland-Jørgensen
Subject: [RFC bpf-next 2/5] libbpf: Add support for auto-pinning of maps with reuse on program load
Date: Tue, 20 Aug 2019 13:47:03 +0200
Message-Id: <20190820114706.18546-3-toke@redhat.com>
X-Mailer: git-send-email 2.22.1
In-Reply-To: <20190820114706.18546-1-toke@redhat.com>
References: <20190820114706.18546-1-toke@redhat.com>
X-Mailing-List: bpf@vger.kernel.org

This adds support to libbpf for automatically pinning maps on program
load. This is needed for porting the iproute2 bpf support to libbpf,
but is also useful in other contexts. The semantics are modelled on
those of the same functionality in iproute2, namely:

- A path can be supplied in bpf_prog_load_attr specifying the directory
  that maps should be pinned into.

- Only maps that specify a non-zero value in their 'pinning' definition
  attribute will be pinned in this automatic mode.

- If an existing pinned map is found at the pinning destination, its
  attributes will be compared with those of the map being loaded; if
  they match, the existing map will be reused instead of creating a new
  one.

A subsequent commit will expand this functionality so programs can use
different pinning paths for different values of the map 'pinning'
attribute, similar to what iproute2 does today.
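For example, a BPF program could mark a map for auto-pinning with a
definition along these lines (illustration only, not part of the patch;
the 'pinning' member of struct bpf_map_def comes from an earlier patch
in this series, and the map name, type and sizes here are made up):

  /* Assumed includes: <linux/bpf.h> and the selftests-style
   * bpf_helpers.h, which provides SEC() and struct bpf_map_def.
   */
  struct bpf_map_def SEC("maps") my_pinned_map = {
          .type        = BPF_MAP_TYPE_HASH,
          .key_size    = sizeof(__u32),
          .value_size  = sizeof(__u64),
          .max_entries = 1024,
          .pinning     = 1, /* non-zero: pin (or reuse) under the supplied path */
  };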
Signed-off-by: Toke Høiland-Jørgensen
---
 tools/lib/bpf/libbpf.c | 161 ++++++++++++++++++++++++++++++++++++++++-
 tools/lib/bpf/libbpf.h |   8 ++
 2 files changed, 168 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2233f919dd88..6d372a965c9d 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -220,6 +220,7 @@ struct bpf_map {
 	size_t sec_offset;
 	int map_ifindex;
 	int inner_map_fd;
+	int pin_reused;
 	struct bpf_map_def def;
 	__u32 btf_key_type_id;
 	__u32 btf_value_type_id;
@@ -3994,8 +3995,10 @@ int bpf_map__unpin(struct bpf_map *map, const char *path)
 	return 0;
 }
 
-int bpf_object__pin_maps(struct bpf_object *obj, const char *path)
+int bpf_object__pin_maps2(struct bpf_object *obj, const char *path,
+			  enum bpf_pin_mode mode)
 {
+	int explicit = (mode == BPF_PIN_MODE_EXPLICIT);
 	struct bpf_map *map;
 	int err;
 
@@ -4015,6 +4018,9 @@ int bpf_object__pin_maps(struct bpf_object *obj, const char *path)
 		char buf[PATH_MAX];
 		int len;
 
+		if ((explicit && !map->def.pinning) || map->pin_reused)
+			continue;
+
 		len = snprintf(buf, PATH_MAX, "%s/%s", path,
 			       bpf_map__name(map));
 		if (len < 0) {
@@ -4037,6 +4043,9 @@ int bpf_object__pin_maps(struct bpf_object *obj, const char *path)
 		char buf[PATH_MAX];
 		int len;
 
+		if ((explicit && !map->def.pinning) || map->pin_reused)
+			continue;
+
 		len = snprintf(buf, PATH_MAX, "%s/%s", path,
 			       bpf_map__name(map));
 		if (len < 0)
@@ -4050,6 +4059,11 @@ int bpf_object__pin_maps(struct bpf_object *obj, const char *path)
 	return err;
 }
 
+int bpf_object__pin_maps(struct bpf_object *obj, const char *path)
+{
+	return bpf_object__pin_maps2(obj, path, BPF_PIN_MODE_ALL);
+}
+
 int bpf_object__unpin_maps(struct bpf_object *obj, const char *path)
 {
 	struct bpf_map *map;
@@ -4802,6 +4816,141 @@ int bpf_prog_load(const char *file, enum bpf_prog_type type,
 	return bpf_prog_load_xattr(&attr, pobj, prog_fd);
 }
 
+static int bpf_read_map_info(int fd, struct bpf_map_def *map,
+			     enum bpf_prog_type *type)
+{
+	unsigned int val, owner_type = 0;
+	char file[PATH_MAX], buff[4096];
+	FILE *fp;
+
+	snprintf(file, sizeof(file), "/proc/%d/fdinfo/%d", getpid(), fd);
+	memset(map, 0, sizeof(*map));
+
+	fp = fopen(file, "r");
+	if (!fp) {
+		pr_warning("No procfs support?!\n");
+		return -EIO;
+	}
+
+	while (fgets(buff, sizeof(buff), fp)) {
+		if (sscanf(buff, "map_type:\t%u", &val) == 1)
+			map->type = val;
+		else if (sscanf(buff, "key_size:\t%u", &val) == 1)
+			map->key_size = val;
+		else if (sscanf(buff, "value_size:\t%u", &val) == 1)
+			map->value_size = val;
+		else if (sscanf(buff, "max_entries:\t%u", &val) == 1)
+			map->max_entries = val;
+		else if (sscanf(buff, "map_flags:\t%i", &val) == 1)
+			map->map_flags = val;
+		else if (sscanf(buff, "owner_prog_type:\t%i", &val) == 1)
+			owner_type = val;
+	}
+
+	fclose(fp);
+	if (type)
+		*type = owner_type;
+
+	return 0;
+}
+
+static void bpf_map_pin_report(const struct bpf_map_def *pin,
+			       const struct bpf_map_def *obj)
+{
+	pr_warning("Map specification differs from pinned file!\n");
+
+	if (obj->type != pin->type)
+		pr_warning(" - Type: %u (obj) != %u (pin)\n",
+			   obj->type, pin->type);
+	if (obj->key_size != pin->key_size)
+		pr_warning(" - Key size: %u (obj) != %u (pin)\n",
+			   obj->key_size, pin->key_size);
+	if (obj->value_size != pin->value_size)
+		pr_warning(" - Value size: %u (obj) != %u (pin)\n",
+			   obj->value_size, pin->value_size);
+	if (obj->max_entries != pin->max_entries)
+		pr_warning(" - Max entries: %u (obj) != %u (pin)\n",
+			   obj->max_entries, pin->max_entries);
+	if (obj->map_flags != pin->map_flags)
+		pr_warning(" - Flags: %#x (obj) != %#x (pin)\n",
+			   obj->map_flags, pin->map_flags);
+
+	pr_warning("\n");
+}
+
+
+
+static int bpf_map_selfcheck_pinned(int fd, const struct bpf_map_def *map,
+				    int length, enum bpf_prog_type type)
+{
+	enum bpf_prog_type owner_type = 0;
+	struct bpf_map_def tmp, zero = {};
+	int ret;
+
+	ret = bpf_read_map_info(fd, &tmp, &owner_type);
+	if (ret < 0)
+		return ret;
+
+	/* The decision to reject this is on kernel side eventually, but
+	 * at least give the user a chance to know what's wrong.
+	 */
+	if (owner_type && owner_type != type)
+		pr_warning("Program array map owner types differ: %u (obj) != %u (pin)\n",
+			   type, owner_type);
+
+	if (!memcmp(&tmp, map, length)) {
+		return 0;
+	} else {
+		/* If kernel doesn't have eBPF-related fdinfo, we cannot do much,
+		 * so just accept it. We know we do have an eBPF fd and in this
+		 * case, everything is 0. It is guaranteed that no such map exists
+		 * since map type of 0 is unloadable BPF_MAP_TYPE_UNSPEC.
+		 */
+		if (!memcmp(&tmp, &zero, length))
+			return 0;
+
+		bpf_map_pin_report(&tmp, map);
+		return -EINVAL;
+	}
+}
+
+
+int bpf_probe_pinned(const struct bpf_map *map,
+		     const struct bpf_prog_load_attr *attr)
+{
+	const char *name = bpf_map__name(map);
+	char buf[PATH_MAX];
+	int fd, len, ret;
+
+	if (!attr->auto_pin_path)
+		return -ENOENT;
+
+	len = snprintf(buf, PATH_MAX, "%s/%s", attr->auto_pin_path,
+		       name);
+	if (len < 0)
+		return -EINVAL;
+	else if (len >= PATH_MAX)
+		return -ENAMETOOLONG;
+
+	fd = bpf_obj_get(buf);
+	if (fd <= 0)
+		return fd;
+
+	ret = bpf_map_selfcheck_pinned(fd, &map->def,
+				       offsetof(struct bpf_map_def,
+						map_id),
+				       attr->prog_type);
+	if (ret < 0) {
+		close(fd);
+		pr_warning("Map \'%s\' self-check failed!\n", name);
+		return ret;
+	}
+	if (attr->log_level)
+		pr_debug("Map \'%s\' loaded as pinned!\n", name);
+
+	return fd;
+}
+
 int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
 			struct bpf_object **pobj, int *prog_fd)
 {
@@ -4853,8 +5002,14 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
 	}
 
 	bpf_object__for_each_map(map, obj) {
+		int fd;
+
 		if (!bpf_map__is_offload_neutral(map))
 			map->map_ifindex = attr->ifindex;
+
+		fd = bpf_probe_pinned(map, attr);
+		if (fd > 0 && !bpf_map__reuse_fd(map, fd))
+			map->pin_reused = 1;
 	}
 
 	if (!first_prog) {
@@ -4869,6 +5024,10 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
 		return -EINVAL;
 	}
 
+	if (attr->auto_pin_path)
+		bpf_object__pin_maps2(obj, attr->auto_pin_path,
+				      BPF_PIN_MODE_EXPLICIT);
+
 	*pobj = obj;
 	*prog_fd = bpf_program__fd(first_prog);
 	return 0;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 5facba6ea1e1..3c5c3256e22d 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -67,6 +67,11 @@ struct bpf_object_open_attr {
 	enum bpf_prog_type prog_type;
 };
 
+enum bpf_pin_mode {
+	BPF_PIN_MODE_ALL = 0,
+	BPF_PIN_MODE_EXPLICIT,
+};
+
 LIBBPF_API struct bpf_object *bpf_object__open(const char *path);
 LIBBPF_API struct bpf_object *
 bpf_object__open_xattr(struct bpf_object_open_attr *attr);
@@ -79,6 +84,8 @@ int bpf_object__section_size(const struct bpf_object *obj, const char *name,
 			     __u32 *size);
 int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
 				__u32 *off);
+LIBBPF_API int bpf_object__pin_maps2(struct bpf_object *obj, const char *path,
+				     enum bpf_pin_mode mode);
 LIBBPF_API int bpf_object__pin_maps(struct bpf_object *obj, const char *path);
 LIBBPF_API int bpf_object__unpin_maps(struct bpf_object *obj,
 				      const char *path);
@@ -353,6 +360,7 @@ struct bpf_prog_load_attr {
 	int ifindex;
 	int log_level;
 	int prog_flags;
+	const char *auto_pin_path;
 };
 
 LIBBPF_API int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
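For reference, usage from the loader side would look roughly like this
(sketch only, not part of the patch; the object file name, program type
and pin directory are made up, while auto_pin_path and
bpf_prog_load_xattr() are the interface touched above):

  #include <linux/bpf.h>
  #include <bpf/libbpf.h>

  int load_with_auto_pin(void)
  {
          struct bpf_prog_load_attr attr = {
                  .file          = "my_prog.o",           /* made-up object file */
                  .prog_type     = BPF_PROG_TYPE_XDP,     /* made-up program type */
                  .auto_pin_path = "/sys/fs/bpf/my_prog", /* maps with .pinning set end up here */
          };
          struct bpf_object *obj;
          int prog_fd;

          /* Maps marked with a non-zero 'pinning' attribute are pinned
           * under auto_pin_path, or reused if a matching pin exists.
           */
          if (bpf_prog_load_xattr(&attr, &obj, &prog_fd))
                  return -1;

          return prog_fd;
  }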