
[bpf,V2,0/2] bpf: adjust uapi for devmap prior to kernel release

Message ID 159170947966.2102545.14401752480810420709.stgit@firesoul

Message

Jesper Dangaard Brouer June 9, 2020, 1:31 p.m. UTC
For special type maps (e.g. devmap and cpumap) the map-value data-layout is
a configuration interface. This is uapi that can only be tail extended.
Thus, new members (and thus features) can only be added to the end of this
structure, and the kernel uses the map->value_size from userspace to
determine feature set 'version'.
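The value_size check described above can be sketched in C. This is only an illustration, not the code in kernel/bpf/devmap.c: struct demo_devmap_val, the DEMO_FEAT_* flags, DEMO_OFFSETOFEND() and demo_features_for_size() are hypothetical names, and the layout is assumed to be a map value whose first version held only ifindex.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map value: version 1 held only ifindex,
 * version 2 appends bpf_prog at the tail. */
struct demo_devmap_val {
	uint32_t ifindex;
	union {
		int      fd;
		uint32_t id;
	} bpf_prog;
};

#define DEMO_FEAT_IFINDEX 0x1
#define DEMO_FEAT_PROG    0x2

/* Offset of the first byte after a member. */
#define DEMO_OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Map the value_size supplied by userspace to a feature set.
 * Returns a bitmask of DEMO_FEAT_* flags, or -1 if the size does not
 * end exactly on a known member boundary. */
static int demo_features_for_size(size_t value_size)
{
	if (value_size == DEMO_OFFSETOFEND(struct demo_devmap_val, ifindex))
		return DEMO_FEAT_IFINDEX;
	if (value_size == DEMO_OFFSETOFEND(struct demo_devmap_val, bpf_prog))
		return DEMO_FEAT_IFINDEX | DEMO_FEAT_PROG;
	return -1;
}
```

Because members are only ever appended, each accepted size corresponds to exactly one prefix of the struct, so the size alone is enough to act as the 'version'.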

For this kind of uapi to be extensible and backward compatible, it is common
that new members/fields (that represent a new feature) in the struct are
initialized as zero, which indicates that the feature isn't used. This makes
it possible to write userspace applications that are unaware of new kernel
features, but just include the latest uapi headers, zero-init the struct and
populate the features they know about.

The recent extension of devmap with a bpf_prog.fd requires the end-user to
supply the file-descriptor value minus-1 to communicate that the feature
isn't used. This isn't compatible with the described kABI extension model.
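
To make the incompatibility concrete, here is a minimal sketch of how a zero-initialized value is interpreted under each convention. The struct and all three helpers are hypothetical illustrations, not kernel code. The assumption is that a "minus-1 means unused" check treats any fd >= 0 as a request to attach a program, while a "zero means unused" check treats only fd > 0 as one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of a tail-extended devmap value. */
struct demo_devmap_val {
	uint32_t ifindex;
	union {
		int      fd;
		uint32_t id;
	} bpf_prog;
};

/* Convention where -1 marks "no program": fd 0 counts as a real fd. */
static bool demo_wants_prog_minus1(const struct demo_devmap_val *v)
{
	return v->bpf_prog.fd >= 0;
}

/* Convention where 0 marks "no program". */
static bool demo_wants_prog_zero(const struct demo_devmap_val *v)
{
	return v->bpf_prog.fd > 0;
}

/* What an application unaware of bpf_prog does after being rebuilt
 * against newer headers: zero the struct, fill in what it knows. */
static struct demo_devmap_val demo_legacy_app_value(uint32_t ifindex)
{
	struct demo_devmap_val v;

	memset(&v, 0, sizeof(v));
	v.ifindex = ifindex;
	return v;
}
```

Under the minus-1 convention the value from demo_legacy_app_value() reads as "attach the program behind fd 0", which the application never asked for. Under the zero convention the same value reads as "feature not used".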

V2: Drop patch-1 that changed BPF-syscall to start at file-descriptor 1

---

Jesper Dangaard Brouer (2):
      bpf: devmap adjust uapi for attach bpf program
      bpf: selftests and tools use struct bpf_devmap_val from uapi


 include/uapi/linux/bpf.h                           |   13 +++++++++++++
 kernel/bpf/devmap.c                                |   17 ++++-------------
 tools/include/uapi/linux/bpf.h                     |   13 +++++++++++++
 .../selftests/bpf/prog_tests/xdp_devmap_attach.c   |    8 --------
 .../selftests/bpf/progs/test_xdp_devmap_helpers.c  |    2 +-
 .../bpf/progs/test_xdp_with_devmap_helpers.c       |    3 +--
 6 files changed, 32 insertions(+), 24 deletions(-)

--

Comments

Alexei Starovoitov June 9, 2020, 7:03 p.m. UTC | #1
On Tue, Jun 09, 2020 at 03:31:41PM +0200, Jesper Dangaard Brouer wrote:
> For special type maps (e.g. devmap and cpumap) the map-value data-layout is
> a configuration interface. This is uapi that can only be tail extended.
> Thus, new members (and thus features) can only be added to the end of this
> structure, and the kernel uses the map->value_size from userspace to
> determine feature set 'version'.
> 
> For this kind of uapi to be extensible and backward compatible, it is common
> that new members/fields (that represent a new feature) in the struct are
> initialized as zero, which indicates that the feature isn't used. This makes
> it possible to write userspace applications that are unaware of new kernel
> features, but just include the latest uapi headers, zero-init the struct and
> populate the features they know about.
> 
> The recent extension of devmap with a bpf_prog.fd requires the end-user to
> supply the file-descriptor value minus-1 to communicate that the feature
> isn't used. This isn't compatible with the described kABI extension model.

Applied to bpf tree without this cover letter, because I don't want
folks to read above and start using kabi terminology like this.
I've never seen a definition of kabi. I've heard redhat has something, but
I don't know what it is and am really not interested to find out.
Studying amd64 psABI, sparc psABI, gABI was enough of a time sink.
When folks use ABI they really mean binary.
Old binaries that use devmap_val will work as-is with newer kernel.
There is no binary breakage due to devmap_val.
Whereas what you describe above is what will happen if something gets
recompiled. It's an API quirk. And arguably not a UAPI breakage.
UAPI structs have to be initialized.
There is a struct and there is an initializer for it.
Like if you did 'spinlock_t lock;' and it got broken with new kernel
it's the programmer's fault. It's not a uapi and certainly not an abi issue.
DEFINE_SPINLOCK() should have been used.
Same thing with user space.
'struct bpf_devmap_val' would be ok from uapi pov even with -1.
It's just much more convenient to have zero init. Less error prone, etc.
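
The struct-plus-initializer pattern described above can be sketched as follows. DEMO_DEVMAP_VAL_INIT and demo_make_val() are hypothetical names, not something the uapi header provides. They only illustrate how a -1 default could be carried by an initializer, in the way DEFINE_SPINLOCK() carries a lock's initial state.

```c
#include <stdint.h>

/* Hypothetical mirror of the map value under discussion. */
struct demo_devmap_val {
	uint32_t ifindex;
	union {
		int      fd;
		uint32_t id;
	} bpf_prog;
};

/* Hypothetical initializer: callers never spell out the sentinel,
 * so a -1 "unused" value stays hidden behind the macro. */
#define DEMO_DEVMAP_VAL_INIT(idx) \
	{ .ifindex = (idx), .bpf_prog = { .fd = -1 } }

/* Build a value with no program attached. */
static struct demo_devmap_val demo_make_val(uint32_t ifindex)
{
	struct demo_devmap_val v = DEMO_DEVMAP_VAL_INIT(ifindex);

	return v;
}
```

With zero as the "unused" value no such macro is needed, since a plain zero-init already gives the right default. That is the convenience the last two lines refer to.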