diff mbox series

[v4,bpf-next,10/14] bpf: allow to specify kernel module BTFs when attaching BPF programs

Message ID 20201202001616.3378929-11-andrii@kernel.org
State Not Applicable
Headers show
Series Support BTF-powered BPF tracing programs for kernel modules | expand

Commit Message

Andrii Nakryiko Dec. 2, 2020, 12:16 a.m. UTC
Add ability for user-space programs to specify non-vmlinux BTF when attaching
BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this,
add attach_btf_obj_id field which contains BTF object ID for either vmlinux or
module. For backwards compatibility (and simplicity) reasons, 0 denotes
vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/btf.h            |  2 ++
 include/uapi/linux/bpf.h       |  1 +
 kernel/bpf/btf.c               | 25 +++++++++++++++++++------
 kernel/bpf/syscall.c           | 22 +++++++++++++++++++---
 tools/include/uapi/linux/bpf.h |  1 +
 5 files changed, 42 insertions(+), 9 deletions(-)

Comments

Alexei Starovoitov Dec. 2, 2020, 8:58 p.m. UTC | #1
On Tue, Dec 01, 2020 at 04:16:12PM -0800, Andrii Nakryiko wrote:
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c3458ec1f30a..60b95b51ccb8 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -558,6 +558,7 @@ union bpf_attr {
>  		__u32		line_info_cnt;	/* number of bpf_line_info records */
>  		__u32		attach_btf_id;	/* in-kernel BTF type id to attach to */
>  		__u32		attach_prog_fd; /* 0 to attach to vmlinux */
> +		__u32		attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */

I think the uapi should use attach_btf_obj_fd here.
Everywhere else uapi is using FDs to point to maps, progs, BTFs of progs.
BTF of a module isn't different from BTF of a program.
Looking at libbpf implementation... it has the FD of a module anyway,
since it needs to fetch it to search for the function btf_id in there.
So there won't be any inconvenience for libbpf to pass FD in here.
From the uapi perspective attach_btf_obj_fd will remove potential
race condition. It's very unlikely race, of course.

The rest of the series look good to me.
Andrii Nakryiko Dec. 2, 2020, 10:43 p.m. UTC | #2
On Wed, Dec 2, 2020 at 12:58 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Dec 01, 2020 at 04:16:12PM -0800, Andrii Nakryiko wrote:
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index c3458ec1f30a..60b95b51ccb8 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -558,6 +558,7 @@ union bpf_attr {
> >               __u32           line_info_cnt;  /* number of bpf_line_info records */
> >               __u32           attach_btf_id;  /* in-kernel BTF type id to attach to */
> >               __u32           attach_prog_fd; /* 0 to attach to vmlinux */
> > +             __u32           attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */
>
> I think the uapi should use attach_btf_obj_fd here.
> Everywhere else uapi is using FDs to point to maps, progs, BTFs of progs.
> BTF of a module isn't different from BTF of a program.
> Looking at libbpf implementation... it has the FD of a module anyway,
> since it needs to fetch it to search for the function btf_id in there.
> So there won't be any inconvenience for libbpf to pass FD in here.
> From the uapi perspective attach_btf_obj_fd will remove potential
> race condition. It's very unlikely race, of course.

Yes, I actually contemplated that, but my preference went the ID way,
because it made libbpf implementation simpler and there was a nice
duality of using ID for types and BTF instances themselves.

The problem with FD is that when I load all module BTF objects, I open
their FD one at a time, and close it as soon as I read BTF raw data
back. If I don't do that on systems with many modules, I'll be keeping
potentially hundreds of FDs open, so I figured I don't want to do
that.

But I do see the FD instead of ID consistency as well, so I can go
with a simple and inefficient implementation of separate FD for each
BTF object for now, and if someone complains, we can teach libbpf to
lazily open FDs of module BTFs that are actually used (later, it will
complicate code unnecessarily). Not really worried about racing with
kernel modules being unloaded.

Also, if we use FD, we might not need a new attach_bpf_obj_id field at
all, we can re-use attach_prog_fd field (put it in union and have
attach_prog_fd/attach_btf_fd). On the kernel side, it would be easy to
check whether provided FD is for bpf_prog or btf. What do you think?
Too mysterious? Or good?

>
> The rest of the series look good to me.

Cool, thanks.
Alexei Starovoitov Dec. 2, 2020, 11:12 p.m. UTC | #3
On Wed, Dec 2, 2020 at 2:43 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Dec 2, 2020 at 12:58 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Dec 01, 2020 at 04:16:12PM -0800, Andrii Nakryiko wrote:
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index c3458ec1f30a..60b95b51ccb8 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -558,6 +558,7 @@ union bpf_attr {
> > >               __u32           line_info_cnt;  /* number of bpf_line_info records */
> > >               __u32           attach_btf_id;  /* in-kernel BTF type id to attach to */
> > >               __u32           attach_prog_fd; /* 0 to attach to vmlinux */
> > > +             __u32           attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */
> >
> > I think the uapi should use attach_btf_obj_fd here.
> > Everywhere else uapi is using FDs to point to maps, progs, BTFs of progs.
> > BTF of a module isn't different from BTF of a program.
> > Looking at libbpf implementation... it has the FD of a module anyway,
> > since it needs to fetch it to search for the function btf_id in there.
> > So there won't be any inconvenience for libbpf to pass FD in here.
> > From the uapi perspective attach_btf_obj_fd will remove potential
> > race condition. It's very unlikely race, of course.
>
> Yes, I actually contemplated that, but my preference went the ID way,
> because it made libbpf implementation simpler and there was a nice
> duality of using ID for types and BTF instances themselves.
>
> The problem with FD is that when I load all module BTF objects, I open
> their FD one at a time, and close it as soon as I read BTF raw data
> back. If I don't do that on systems with many modules, I'll be keeping
> potentially hundreds of FDs open, so I figured I don't want to do
> that.
>
> But I do see the FD instead of ID consistency as well, so I can go
> with a simple and inefficient implementation of separate FD for each
> BTF object for now, and if someone complains, we can teach libbpf to
> lazily open FDs of module BTFs that are actually used (later, it will
> complicate code unnecessarily). Not really worried about racing with
> kernel modules being unloaded.
>
> Also, if we use FD, we might not need a new attach_bpf_obj_id field at
> all, we can re-use attach_prog_fd field (put it in union and have
> attach_prog_fd/attach_btf_fd). On the kernel side, it would be easy to
> check whether provided FD is for bpf_prog or btf. What do you think?
> Too mysterious? Or good?

You mean like:
union {
         __u32           attach_prog_fd; /* valid prog_fd to attach to
bpf prog */
         __u32           attach_btf_obj_fd; /* or  valid module BTF
object fd or zero to attach to vmlinux */
};
or don't introduce a new field name at all?
Sure. I'm fine with both. I think it's a good idea.
Andrii Nakryiko Dec. 2, 2020, 11:14 p.m. UTC | #4
On Wed, Dec 2, 2020 at 3:12 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Dec 2, 2020 at 2:43 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, Dec 2, 2020 at 12:58 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Tue, Dec 01, 2020 at 04:16:12PM -0800, Andrii Nakryiko wrote:
> > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > index c3458ec1f30a..60b95b51ccb8 100644
> > > > --- a/include/uapi/linux/bpf.h
> > > > +++ b/include/uapi/linux/bpf.h
> > > > @@ -558,6 +558,7 @@ union bpf_attr {
> > > >               __u32           line_info_cnt;  /* number of bpf_line_info records */
> > > >               __u32           attach_btf_id;  /* in-kernel BTF type id to attach to */
> > > >               __u32           attach_prog_fd; /* 0 to attach to vmlinux */
> > > > +             __u32           attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */
> > >
> > > I think the uapi should use attach_btf_obj_fd here.
> > > Everywhere else uapi is using FDs to point to maps, progs, BTFs of progs.
> > > BTF of a module isn't different from BTF of a program.
> > > Looking at libbpf implementation... it has the FD of a module anyway,
> > > since it needs to fetch it to search for the function btf_id in there.
> > > So there won't be any inconvenience for libbpf to pass FD in here.
> > > From the uapi perspective attach_btf_obj_fd will remove potential
> > > race condition. It's very unlikely race, of course.
> >
> > Yes, I actually contemplated that, but my preference went the ID way,
> > because it made libbpf implementation simpler and there was a nice
> > duality of using ID for types and BTF instances themselves.
> >
> > The problem with FD is that when I load all module BTF objects, I open
> > their FD one at a time, and close it as soon as I read BTF raw data
> > back. If I don't do that on systems with many modules, I'll be keeping
> > potentially hundreds of FDs open, so I figured I don't want to do
> > that.
> >
> > But I do see the FD instead of ID consistency as well, so I can go
> > with a simple and inefficient implementation of separate FD for each
> > BTF object for now, and if someone complains, we can teach libbpf to
> > lazily open FDs of module BTFs that are actually used (later, it will
> > complicate code unnecessarily). Not really worried about racing with
> > kernel modules being unloaded.
> >
> > Also, if we use FD, we might not need a new attach_bpf_obj_id field at
> > all, we can re-use attach_prog_fd field (put it in union and have
> > attach_prog_fd/attach_btf_fd). On the kernel side, it would be easy to
> > check whether provided FD is for bpf_prog or btf. What do you think?
> > Too mysterious? Or good?
>
> You mean like:
> union {
>          __u32           attach_prog_fd; /* valid prog_fd to attach to
> bpf prog */
>          __u32           attach_btf_obj_fd; /* or  valid module BTF
> object fd or zero to attach to vmlinux */
> };

like this with union, an aliased field name with a meaningful name

> or don't introduce a new field name at all?
> Sure. I'm fine with both. I think it's a good idea.

ok, will do this then
diff mbox series

Patch

diff --git a/include/linux/btf.h b/include/linux/btf.h
index fb608e4de076..53ce2bc6dc54 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -89,7 +89,9 @@  int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj,
 			   char *buf, int len, u64 flags);
 
 int btf_get_fd_by_id(u32 id);
+struct btf *btf_get_by_id(int id);
 u32 btf_obj_id(const struct btf *btf);
+bool btf_is_kernel(const struct btf *btf);
 bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
 			   const struct btf_member *m,
 			   u32 expected_offset, u32 expected_size);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c3458ec1f30a..60b95b51ccb8 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -558,6 +558,7 @@  union bpf_attr {
 		__u32		line_info_cnt;	/* number of bpf_line_info records */
 		__u32		attach_btf_id;	/* in-kernel BTF type id to attach to */
 		__u32		attach_prog_fd; /* 0 to attach to vmlinux */
+		__u32		attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 7a19bf5bfe97..12876b272c6b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5652,6 +5652,19 @@  struct btf *btf_get_by_fd(int fd)
 	return btf;
 }
 
+struct btf *btf_get_by_id(int id)
+{
+	struct btf *btf;
+
+	rcu_read_lock();
+	btf = idr_find(&btf_idr, id);
+	if (!btf || !refcount_inc_not_zero(&btf->refcnt))
+		btf = ERR_PTR(-ENOENT);
+	rcu_read_unlock();
+
+	return btf;
+}
+
 int btf_get_info_by_fd(const struct btf *btf,
 		       const union bpf_attr *attr,
 		       union bpf_attr __user *uattr)
@@ -5717,12 +5730,7 @@  int btf_get_fd_by_id(u32 id)
 	struct btf *btf;
 	int fd;
 
-	rcu_read_lock();
-	btf = idr_find(&btf_idr, id);
-	if (!btf || !refcount_inc_not_zero(&btf->refcnt))
-		btf = ERR_PTR(-ENOENT);
-	rcu_read_unlock();
-
+	btf = btf_get_by_id(id);
 	if (IS_ERR(btf))
 		return PTR_ERR(btf);
 
@@ -5738,6 +5746,11 @@  u32 btf_obj_id(const struct btf *btf)
 	return btf->id;
 }
 
+bool btf_is_kernel(const struct btf *btf)
+{
+	return btf->kernel_btf;
+}
+
 static int btf_id_cmp_func(const void *a, const void *b)
 {
 	const int *pa = a, *pb = b;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 5ee00611af53..3af073642664 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1968,7 +1968,7 @@  static void bpf_prog_load_fixup_attach_type(union bpf_attr *attr)
 static int
 bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 			   enum bpf_attach_type expected_attach_type,
-			   u32 btf_id, u32 prog_fd)
+			   u32 btf_obj_id, u32 btf_id, u32 prog_fd)
 {
 	if (btf_id) {
 		if (btf_id > BTF_MAX_TYPE)
@@ -1985,6 +1985,9 @@  bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 		}
 	}
 
+	if (btf_obj_id && (!btf_id || prog_fd))
+		return -EINVAL;
+
 	if (prog_fd && prog_type != BPF_PROG_TYPE_TRACING &&
 	    prog_type != BPF_PROG_TYPE_EXT)
 		return -EINVAL;
@@ -2097,7 +2100,7 @@  static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 }
 
 /* last field in 'union bpf_attr' used by this command */
-#define	BPF_PROG_LOAD_LAST_FIELD attach_prog_fd
+#define	BPF_PROG_LOAD_LAST_FIELD attach_btf_obj_id
 
 static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 {
@@ -2146,6 +2149,7 @@  static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 
 	bpf_prog_load_fixup_attach_type(attr);
 	if (bpf_prog_load_check_attach(type, attr->expected_attach_type,
+				       attr->attach_btf_obj_id,
 				       attr->attach_btf_id,
 				       attr->attach_prog_fd))
 		return -EINVAL;
@@ -2158,7 +2162,19 @@  static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 	prog->expected_attach_type = attr->expected_attach_type;
 	prog->aux->attach_btf_id = attr->attach_btf_id;
 
-	if (attr->attach_btf_id && !attr->attach_prog_fd) {
+	if (attr->attach_btf_obj_id) {
+		struct btf *btf;
+
+		btf = btf_get_by_id(attr->attach_btf_obj_id);
+		if (IS_ERR(btf))
+			return PTR_ERR(btf);
+		if (!btf_is_kernel(btf)) {
+			btf_put(btf);
+			return -EINVAL;
+		}
+
+		prog->aux->attach_btf = btf;
+	} else if (attr->attach_btf_id && !attr->attach_prog_fd) {
 		struct btf *btf;
 
 		btf = bpf_get_btf_vmlinux();
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c3458ec1f30a..60b95b51ccb8 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -558,6 +558,7 @@  union bpf_attr {
 		__u32		line_info_cnt;	/* number of bpf_line_info records */
 		__u32		attach_btf_id;	/* in-kernel BTF type id to attach to */
 		__u32		attach_prog_fd; /* 0 to attach to vmlinux */
+		__u32		attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */