Message ID | 159163507753.1967373.62249862728421448.stgit@firesoul |
---|---|
State | Changes Requested |
Delegated to: | BPF Maintainers |
Headers | show |
Series | bpf: avoid using/returning file descriptor value zero | expand |
Jesper Dangaard Brouer <brouer@redhat.com> writes: > This patch change BPF syscall to avoid returning file descriptor value zero. > > As mentioned in cover letter, it is very impractical when extending kABI > that the file-descriptor value 'zero' is valid, as this requires new fields > must be initialised as minus-1. First step is to change the kernel such that > BPF-syscall simply doesn't return value zero as a FD number. > > This patch achieves this by similar code to anon_inode_getfd(), with the > exception of getting unused FD starting from 1. The kernel already supports > starting from a specific FD value, as this is used by f_dupfd(). It seems > simpler to replicate part of anon_inode_getfd() code and use this start from > offset feature, instead of using f_dupfd() handling afterwards. > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> > --- > fs/file.c | 2 +- > include/linux/file.h | 1 + > kernel/bpf/syscall.c | 38 ++++++++++++++++++++++++++++++++------ > 3 files changed, 34 insertions(+), 7 deletions(-) > > diff --git a/fs/file.c b/fs/file.c > index abb8b7081d7a..122185cb7707 100644 > --- a/fs/file.c > +++ b/fs/file.c > @@ -535,7 +535,7 @@ int __alloc_fd(struct files_struct *files, > return error; > } > > -static int alloc_fd(unsigned start, unsigned flags) > +int alloc_fd(unsigned start, unsigned flags) Missing an EXPORT_SYMBOL() to go with this. > { > return __alloc_fd(current->files, start, rlimit(RLIMIT_NOFILE), flags); > } > diff --git a/include/linux/file.h b/include/linux/file.h > index 122f80084a3e..927fb6c2582d 100644 > --- a/include/linux/file.h > +++ b/include/linux/file.h > @@ -85,6 +85,7 @@ extern int f_dupfd(unsigned int from, struct file *file, unsigned flags); > extern int replace_fd(unsigned fd, struct file *file, unsigned flags); > extern void set_close_on_exec(unsigned int fd, int flag); > extern bool get_close_on_exec(unsigned int fd); > +extern int alloc_fd(unsigned start, unsigned flags); > extern int __get_unused_fd_flags(unsigned flags, unsigned long nofile); > extern int get_unused_fd_flags(unsigned flags); > extern void put_unused_fd(unsigned int fd); > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index 4d530b1d5683..6eba236aacd1 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -688,6 +688,32 @@ const struct file_operations bpf_map_fops = { > .poll = bpf_map_poll, > }; > > +/* Code is similar to anon_inode_getfd(), except starts at FD 1 */ > +int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, > + void *priv, int flags) > +{ I think it's a little sad that we have to essentially re-implement anon_inode_getfd(). I guess an alternative could be to add an extra parameter to the existing function, either with a different name and a wrapper function, or just change all the callers (by my count there are only 13 call sites outside of BPF). Not sure if that is better, though? > + int error, fd; > + struct file *file; > + > + error = alloc_fd(1, flags); > + if (error < 0) > + return error; > + fd = error; > + > + file = anon_inode_getfile(name, fops, priv, flags); > + if (IS_ERR(file)) { > + error = PTR_ERR(file); > + goto err_put_unused_fd; > + } > + fd_install(fd, file); > + > + return fd; > + > +err_put_unused_fd: > + put_unused_fd(fd); > + return error; > +} > + > int bpf_map_new_fd(struct bpf_map *map, int flags) > { > int ret; > @@ -696,8 +722,8 @@ int bpf_map_new_fd(struct bpf_map *map, int flags) > if (ret < 0) > return ret; > > - return anon_inode_getfd("bpf-map", &bpf_map_fops, map, > - flags | O_CLOEXEC); > + return bpf_anon_inode_getfd("bpf-map", &bpf_map_fops, map, > + flags | O_CLOEXEC); > } > > int bpf_get_file_flag(int flags) > @@ -1840,8 +1866,8 @@ int bpf_prog_new_fd(struct bpf_prog *prog) > if (ret < 0) > return ret; > > - return anon_inode_getfd("bpf-prog", &bpf_prog_fops, prog, > - O_RDWR | O_CLOEXEC); > + return bpf_anon_inode_getfd("bpf-prog", &bpf_prog_fops, prog, > + O_RDWR | O_CLOEXEC); > } > > static struct bpf_prog *____bpf_prog_get(struct fd f) > @@ -2471,7 +2497,7 @@ int bpf_link_settle(struct bpf_link_primer *primer) > > int bpf_link_new_fd(struct bpf_link *link) > { > - return anon_inode_getfd("bpf-link", &bpf_link_fops, link, O_CLOEXEC); > + return bpf_anon_inode_getfd("bpf-link", &bpf_link_fops, link, O_CLOEXEC); > } > > struct bpf_link *bpf_link_get_from_fd(u32 ufd) > @@ -4024,7 +4050,7 @@ static int bpf_enable_runtime_stats(void) > return -EBUSY; > } > > - fd = anon_inode_getfd("bpf-stats", &bpf_stats_fops, NULL, O_CLOEXEC); > + fd = bpf_anon_inode_getfd("bpf-stats", &bpf_stats_fops, NULL, O_CLOEXEC); > if (fd >= 0) > static_key_slow_inc(&bpf_stats_enabled_key.key); > I guess you should also fix __btf_new_fd() in btf.c?
On Mon, Jun 8, 2020 at 9:51 AM Jesper Dangaard Brouer <brouer@redhat.com> wrote: > > This patch change BPF syscall to avoid returning file descriptor value zero. > > As mentioned in cover letter, it is very impractical when extending kABI > that the file-descriptor value 'zero' is valid, as this requires new fields > must be initialised as minus-1. First step is to change the kernel such that > BPF-syscall simply doesn't return value zero as a FD number. > > This patch achieves this by similar code to anon_inode_getfd(), with the > exception of getting unused FD starting from 1. The kernel already supports > starting from a specific FD value, as this is used by f_dupfd(). It seems > simpler to replicate part of anon_inode_getfd() code and use this start from > offset feature, instead of using f_dupfd() handling afterwards. Wouldn't it be better to just handle that on libbpf side? That way it works on all kernels and doesn't require this duplication of logic inside kernel? > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> > --- > fs/file.c | 2 +- > include/linux/file.h | 1 + > kernel/bpf/syscall.c | 38 ++++++++++++++++++++++++++++++++------ > 3 files changed, 34 insertions(+), 7 deletions(-) > [...]
Hi Jesper, I love your patch! Perhaps something to improve: [auto build test WARNING on bpf/master] url: https://github.com/0day-ci/linux/commits/Jesper-Dangaard-Brouer/bpf-avoid-using-returning-file-descriptor-value-zero/20200609-005457 base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master config: um-allmodconfig (attached as .config) compiler: gcc-9 (Debian 9.3.0-13) 9.3.0 reproduce (this is a W=1 build): # save the attached .config to linux build tree make W=1 ARCH=um If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>, old ones prefixed by <<): cc1: warning: arch/um/include/uapi: No such file or directory [-Wmissing-include-dirs] In file included from include/linux/kernel.h:11, from include/linux/list.h:9, from include/linux/timer.h:5, from include/linux/workqueue.h:9, from include/linux/bpf.h:9, from kernel/bpf/syscall.c:4: include/asm-generic/fixmap.h: In function 'fix_to_virt': include/asm-generic/fixmap.h:32:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 32 | BUILD_BUG_ON(idx >= __end_of_fixed_addresses); | ^~ include/linux/compiler.h:383:9: note: in definition of macro '__compiletime_assert' 383 | if (!(condition)) | ^~~~~~~~~ include/linux/compiler.h:403:2: note: in expansion of macro '_compiletime_assert' 403 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert' 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) | ^~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:50:2: note: in expansion of macro 'BUILD_BUG_ON_MSG' 50 | BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) | ^~~~~~~~~~~~~~~~ include/asm-generic/fixmap.h:32:2: note: in expansion of macro 'BUILD_BUG_ON' 32 | BUILD_BUG_ON(idx >= __end_of_fixed_addresses); | ^~~~~~~~~~~~ In file included from include/linux/uaccess.h:11, from include/linux/sched/task.h:11, from include/linux/sched/signal.h:9, from include/linux/rcuwait.h:6, from include/linux/percpu-rwsem.h:7, from include/linux/fs.h:34, from include/linux/huge_mm.h:8, from include/linux/mm.h:675, from include/linux/kallsyms.h:12, from include/linux/bpf.h:21, from kernel/bpf/syscall.c:4: arch/um/include/asm/uaccess.h: In function '__access_ok': arch/um/include/asm/uaccess.h:17:29: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 17 | (((unsigned long) (addr) >= FIXADDR_USER_START) && | ^~ arch/um/include/asm/uaccess.h:45:3: note: in expansion of macro '__access_ok_vsyscall' 45 | __access_ok_vsyscall(addr, size) || | ^~~~~~~~~~~~~~~~~~~~ kernel/bpf/syscall.c: At top level: >> kernel/bpf/syscall.c:692:5: warning: no previous prototype for 'bpf_anon_inode_getfd' [-Wmissing-prototypes] 692 | int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, | ^~~~~~~~~~~~~~~~~~~~ vim +/bpf_anon_inode_getfd +692 kernel/bpf/syscall.c 690 691 /* Code is similar to anon_inode_getfd(), except starts at FD 1 */ > 692 int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, 693 void *priv, int flags) 694 { 695 int error, fd; 696 struct file *file; 697 698 error = alloc_fd(1, flags); 699 if (error < 0) 700 return error; 701 fd = error; 702 703 file = anon_inode_getfile(name, fops, priv, flags); 704 if (IS_ERR(file)) { 705 error = PTR_ERR(file); 706 goto err_put_unused_fd; 707 } 708 fd_install(fd, file); 709 710 return fd; 711 712 err_put_unused_fd: 713 put_unused_fd(fd); 714 return error; 715 } 716 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Mon, 8 Jun 2020 11:36:33 -0700 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > On Mon, Jun 8, 2020 at 9:51 AM Jesper Dangaard Brouer <brouer@redhat.com> wrote: > > > > This patch change BPF syscall to avoid returning file descriptor value zero. > > > > As mentioned in cover letter, it is very impractical when extending kABI > > that the file-descriptor value 'zero' is valid, as this requires new fields > > must be initialised as minus-1. First step is to change the kernel such that > > BPF-syscall simply doesn't return value zero as a FD number. > > > > This patch achieves this by similar code to anon_inode_getfd(), with the > > exception of getting unused FD starting from 1. The kernel already supports > > starting from a specific FD value, as this is used by f_dupfd(). It seems > > simpler to replicate part of anon_inode_getfd() code and use this start from > > offset feature, instead of using f_dupfd() handling afterwards. > > Wouldn't it be better to just handle that on libbpf side? That way it > works on all kernels and doesn't require this duplication of logic > inside kernel? IMHO this is needed on the kernel side, to pair it with the API change. I don't see how doing this in libbpf can cover all cases. First of all, some users might not be using libbpf. Second, a userspace application could be using an older version of libbpf on a newer kernel. (Notice this is not only due to older distros, but also because projects using git submodule libbpf will freeze at some point in time that worked).
Hi Jesper,
I love your patch! Perhaps something to improve:
[auto build test WARNING on bpf/master]
url: https://github.com/0day-ci/linux/commits/Jesper-Dangaard-Brouer/bpf-avoid-using-returning-file-descriptor-value-zero/20200609-005457
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master
config: i386-randconfig-s001-20200609 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-13) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.1-247-gcadbd124-dirty
# save the attached .config to linux build tree
make W=1 C=1 ARCH=i386 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
sparse warnings: (new ones prefixed by >>)
>> kernel/bpf/syscall.c:692:5: sparse: sparse: symbol 'bpf_anon_inode_getfd' was not declared. Should it be static?
Please review and possibly fold the followup patch.
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
-- Andrii On Mon, Jun 8, 2020 at 12:45 PM Jesper Dangaard Brouer <brouer@redhat.com> wrote: > > On Mon, 8 Jun 2020 11:36:33 -0700 > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > On Mon, Jun 8, 2020 at 9:51 AM Jesper Dangaard Brouer <brouer@redhat.com> wrote: > > > > > > This patch change BPF syscall to avoid returning file descriptor value zero. > > > > > > As mentioned in cover letter, it is very impractical when extending kABI > > > that the file-descriptor value 'zero' is valid, as this requires new fields > > > must be initialised as minus-1. First step is to change the kernel such that > > > BPF-syscall simply doesn't return value zero as a FD number. > > > > > > This patch achieves this by similar code to anon_inode_getfd(), with the > > > exception of getting unused FD starting from 1. The kernel already supports > > > starting from a specific FD value, as this is used by f_dupfd(). It seems > > > simpler to replicate part of anon_inode_getfd() code and use this start from > > > offset feature, instead of using f_dupfd() handling afterwards. > > > > Wouldn't it be better to just handle that on libbpf side? That way it > > works on all kernels and doesn't require this duplication of logic > > inside kernel? > > IMHO this is needed on the kernel side, to pair it with the API change. > I don't see how doing this in libbpf can cover all cases. > In practice, it's going to be very rare that fd=0 is not already allocated for application. So even not doing anything is going to work in 99.9% of cases. Think about this, any realistic BPF program will have a map or global variable associated with it. So in the rare case where we have FD 0 not used, map will get that FD. Even if not, if you load your program from ELF file, that ELF file will get FD 0. Even if not, modern BPF programs will have BTF object associated, which will get FD 0. And so on... Even daemons will probably already have some FD open for whatever logging/output it needs to do (it doesn't have to be stdout). So this FD=0 is very hypothetical case, very. You have to essentially stage it consciously, to actually hit it. Second of all, this BPF-specific FD allocation logic is just that -- duplication and extra code to maintain. Other folks in kernel will eventually just be "huh? bpf needs its own anon_file API, why?!..." Do we really want more of that? Third, you already missed anon_inode_getfile use in bpf_link_prime(), and in the future we'll undoubtedly will miss something else, so this FD >= 1 from kernel will work only sometimes and no one will notice until it breaks for someone, which won't happen for a while (because it works as is most of the time, see above). I'm not even talking about the fact that we are also subverting standard Linux promise that a user gets the smallest available FD in the system... So I think libbpf can be kind to user and do: int fd = bpf_whatever(); if (fd == 0) { fd = dup(0); close(0); } But even if it doesn't, not a big deal and probably no one ever will have this problem with FD 0 for BPF program. > First of all, some users might not be using libbpf. > > Second, a userspace application could be using an older version of > libbpf on a newer kernel. (Notice this is not only due to older > distros, but also because projects using git submodule libbpf will > freeze at some point in time that worked). Theoretically this is a problem, in practice libbpf is always more up-to-date compared to kernel... so I don't buy this argument, honestly :) > > -- > Best regards, > Jesper Dangaard Brouer > MSc.CS, Principal Kernel Engineer at Red Hat > LinkedIn: http://www.linkedin.com/in/brouer >
Hi Jesper, I love your patch! Perhaps something to improve: [auto build test WARNING on bpf/master] url: https://github.com/0day-ci/linux/commits/Jesper-Dangaard-Brouer/bpf-avoid-using-returning-file-descriptor-value-zero/20200609-005457 base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master config: x86_64-allyesconfig (attached as .config) compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project e429cffd4f228f70c1d9df0e5d77c08590dd9766) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install x86_64 cross compiling tool for clang build # apt-get install binutils-x86-64-linux-gnu # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>, old ones prefixed by <<): >> kernel/bpf/syscall.c:692:5: warning: no previous prototype for function 'bpf_anon_inode_getfd' [-Wmissing-prototypes] int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, ^ kernel/bpf/syscall.c:692:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, ^ static 1 warning generated. vim +/bpf_anon_inode_getfd +692 kernel/bpf/syscall.c 690 691 /* Code is similar to anon_inode_getfd(), except starts at FD 1 */ > 692 int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, 693 void *priv, int flags) 694 { 695 int error, fd; 696 struct file *file; 697 698 error = alloc_fd(1, flags); 699 if (error < 0) 700 return error; 701 fd = error; 702 703 file = anon_inode_getfile(name, fops, priv, flags); 704 if (IS_ERR(file)) { 705 error = PTR_ERR(file); 706 goto err_put_unused_fd; 707 } 708 fd_install(fd, file); 709 710 return fd; 711 712 err_put_unused_fd: 713 put_unused_fd(fd); 714 return error; 715 } 716 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Jesper, I love your patch! Perhaps something to improve: [auto build test WARNING on bpf/master] url: https://github.com/0day-ci/linux/commits/Jesper-Dangaard-Brouer/bpf-avoid-using-returning-file-descriptor-value-zero/20200609-005457 base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master config: x86_64-randconfig-r003-20200608 (attached as .config) compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project e429cffd4f228f70c1d9df0e5d77c08590dd9766) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install x86_64 cross compiling tool for clang build # apt-get install binutils-x86-64-linux-gnu # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>, old ones prefixed by <<): >> kernel/bpf/syscall.c:692:5: warning: no previous prototype for function 'bpf_anon_inode_getfd' [-Wmissing-prototypes] int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, ^ kernel/bpf/syscall.c:692:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, ^ static 1 warning generated. vim +/bpf_anon_inode_getfd +692 kernel/bpf/syscall.c 690 691 /* Code is similar to anon_inode_getfd(), except starts at FD 1 */ > 692 int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, 693 void *priv, int flags) 694 { 695 int error, fd; 696 struct file *file; 697 698 error = alloc_fd(1, flags); 699 if (error < 0) 700 return error; 701 fd = error; 702 703 file = anon_inode_getfile(name, fops, priv, flags); 704 if (IS_ERR(file)) { 705 error = PTR_ERR(file); 706 goto err_put_unused_fd; 707 } 708 fd_install(fd, file); 709 710 return fd; 711 712 err_put_unused_fd: 713 put_unused_fd(fd); 714 return error; 715 } 716 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
diff --git a/fs/file.c b/fs/file.c index abb8b7081d7a..122185cb7707 100644 --- a/fs/file.c +++ b/fs/file.c @@ -535,7 +535,7 @@ int __alloc_fd(struct files_struct *files, return error; } -static int alloc_fd(unsigned start, unsigned flags) +int alloc_fd(unsigned start, unsigned flags) { return __alloc_fd(current->files, start, rlimit(RLIMIT_NOFILE), flags); } diff --git a/include/linux/file.h b/include/linux/file.h index 122f80084a3e..927fb6c2582d 100644 --- a/include/linux/file.h +++ b/include/linux/file.h @@ -85,6 +85,7 @@ extern int f_dupfd(unsigned int from, struct file *file, unsigned flags); extern int replace_fd(unsigned fd, struct file *file, unsigned flags); extern void set_close_on_exec(unsigned int fd, int flag); extern bool get_close_on_exec(unsigned int fd); +extern int alloc_fd(unsigned start, unsigned flags); extern int __get_unused_fd_flags(unsigned flags, unsigned long nofile); extern int get_unused_fd_flags(unsigned flags); extern void put_unused_fd(unsigned int fd); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 4d530b1d5683..6eba236aacd1 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -688,6 +688,32 @@ const struct file_operations bpf_map_fops = { .poll = bpf_map_poll, }; +/* Code is similar to anon_inode_getfd(), except starts at FD 1 */ +int bpf_anon_inode_getfd(const char *name, const struct file_operations *fops, + void *priv, int flags) +{ + int error, fd; + struct file *file; + + error = alloc_fd(1, flags); + if (error < 0) + return error; + fd = error; + + file = anon_inode_getfile(name, fops, priv, flags); + if (IS_ERR(file)) { + error = PTR_ERR(file); + goto err_put_unused_fd; + } + fd_install(fd, file); + + return fd; + +err_put_unused_fd: + put_unused_fd(fd); + return error; +} + int bpf_map_new_fd(struct bpf_map *map, int flags) { int ret; @@ -696,8 +722,8 @@ int bpf_map_new_fd(struct bpf_map *map, int flags) if (ret < 0) return ret; - return anon_inode_getfd("bpf-map", &bpf_map_fops, map, - flags | O_CLOEXEC); + return bpf_anon_inode_getfd("bpf-map", &bpf_map_fops, map, + flags | O_CLOEXEC); } int bpf_get_file_flag(int flags) @@ -1840,8 +1866,8 @@ int bpf_prog_new_fd(struct bpf_prog *prog) if (ret < 0) return ret; - return anon_inode_getfd("bpf-prog", &bpf_prog_fops, prog, - O_RDWR | O_CLOEXEC); + return bpf_anon_inode_getfd("bpf-prog", &bpf_prog_fops, prog, + O_RDWR | O_CLOEXEC); } static struct bpf_prog *____bpf_prog_get(struct fd f) @@ -2471,7 +2497,7 @@ int bpf_link_settle(struct bpf_link_primer *primer) int bpf_link_new_fd(struct bpf_link *link) { - return anon_inode_getfd("bpf-link", &bpf_link_fops, link, O_CLOEXEC); + return bpf_anon_inode_getfd("bpf-link", &bpf_link_fops, link, O_CLOEXEC); } struct bpf_link *bpf_link_get_from_fd(u32 ufd) @@ -4024,7 +4050,7 @@ static int bpf_enable_runtime_stats(void) return -EBUSY; } - fd = anon_inode_getfd("bpf-stats", &bpf_stats_fops, NULL, O_CLOEXEC); + fd = bpf_anon_inode_getfd("bpf-stats", &bpf_stats_fops, NULL, O_CLOEXEC); if (fd >= 0) static_key_slow_inc(&bpf_stats_enabled_key.key);
This patch change BPF syscall to avoid returning file descriptor value zero. As mentioned in cover letter, it is very impractical when extending kABI that the file-descriptor value 'zero' is valid, as this requires new fields must be initialised as minus-1. First step is to change the kernel such that BPF-syscall simply doesn't return value zero as a FD number. This patch achieves this by similar code to anon_inode_getfd(), with the exception of getting unused FD starting from 1. The kernel already supports starting from a specific FD value, as this is used by f_dupfd(). It seems simpler to replicate part of anon_inode_getfd() code and use this start from offset feature, instead of using f_dupfd() handling afterwards. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> --- fs/file.c | 2 +- include/linux/file.h | 1 + kernel/bpf/syscall.c | 38 ++++++++++++++++++++++++++++++++------ 3 files changed, 34 insertions(+), 7 deletions(-)