diff mbox series

[RFC,bpf-next] bpf: add new jited info fields in bpf_dev_offload and bpf_prog_info

Message ID 20180112004747.17698-1-jakub.kicinski@netronome.com
State RFC, archived
Delegated to: BPF Maintainers
Headers show
Series [RFC,bpf-next] bpf: add new jited info fields in bpf_dev_offload and bpf_prog_info | expand

Commit Message

Jakub Kicinski Jan. 12, 2018, 12:47 a.m. UTC
Hi!

Jiong is working on dumping JITed NFP image via bpftool, Francois will be
submitting support for NFP in binutils soon (whoop! :)).

We would appreciate if you could weigh in on the uAPI.  Is it OK to reuse
the existing jited_prog_len/jited_prog_insns or should we add separate
2 new fields (plus the arch name) to avoid confusing old user space?

From: Jiong Wang <jiong.wang@netronome.com>

For host JIT, there are "jited_len"/"bpf_func" fields in struct bpf_prog
used by all host JIT targets to get jited image and it's length. While for
offload, targets are likely to have different offload mechanisms that these
info are kept in device private data fields.

Therefore, BPF_OBJ_GET_INFO_BY_FD syscall needs an unified way to get JIT
length and contents info for offload targets.

One way is to introduce new callback to parse device private data then fill
those fields in bpf_prog_info. This might be a little heavy, the other way
is to add generic fields which will be initialized by all offload targets.

This patch follows the second approach to introduce two new fields in
struct bpf_dev_offload and teach bpf_prog_get_info_by_fd about them to fill
correct jited_prog_len and jited_prog_insns in bpf_prog_info.

Also, currently userspace tools can't get offload architecture info from
bpf_prog_info. This info is necessary for choosing correct disassembler.

This patch add name info in both bpf_dev_offload and bpf_prog_info so it
could be used by tools to select correct architecture.

The code logic in bpf_prog_offload_info_fill is adjusted slightly. Code
that only applies to offload are centered in bpf_prog_offload_info_fill as
much as possible.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 include/linux/bpf.h            |  3 +++
 include/uapi/linux/bpf.h       |  2 ++
 kernel/bpf/offload.c           | 26 ++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  2 ++
 4 files changed, 33 insertions(+)

Comments

Jakub Kicinski Jan. 12, 2018, 2:17 a.m. UTC | #1
On Thu, 11 Jan 2018 16:47:47 -0800, Jakub Kicinski wrote:
> Hi!
> 
> Jiong is working on dumping JITed NFP image via bpftool, Francois will be
> submitting support for NFP in binutils soon (whoop! :)).
> 
> We would appreciate if you could weigh in on the uAPI.  Is it OK to reuse
> the existing jited_prog_len/jited_prog_insns or should we add separate
> 2 new fields (plus the arch name) to avoid confusing old user space?

Ah, I skipped one chunk of Jiong's patch here, this would also be
necessary if we reuse fields:

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2bac0dc..c7831cd 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1673,19 +1673,6 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 		goto done;
 	}
 
-	ulen = info.jited_prog_len;
-	info.jited_prog_len = prog->jited_len;
-	if (info.jited_prog_len && ulen) {
-		if (bpf_dump_raw_ok()) {
-			uinsns = u64_to_user_ptr(info.jited_prog_insns);
-			ulen = min_t(u32, info.jited_prog_len, ulen);
-			if (copy_to_user(uinsns, prog->bpf_func, ulen))
-				return -EFAULT;
-		} else {
-			info.jited_prog_insns = 0;
-		}
-	}
-
 	ulen = info.xlated_prog_len;
 	info.xlated_prog_len = bpf_prog_insn_size(prog);
 	if (info.xlated_prog_len && ulen) {
@@ -1711,6 +1698,21 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 		err = bpf_prog_offload_info_fill(&info, prog);
 		if (err)
 			return err;
+		else
+			goto done;
+	}
+
+	ulen = info.jited_prog_len;
+	info.jited_prog_len = prog->jited_len;
+	if (info.jited_prog_len && ulen) {
+		if (bpf_dump_raw_ok()) {
+			uinsns = u64_to_user_ptr(info.jited_prog_insns);
+			ulen = min_t(u32, info.jited_prog_len, ulen);
+			if (copy_to_user(uinsns, prog->bpf_func, ulen))
+				return -EFAULT;
+		} else {
+			info.jited_prog_insns = 0;
+		}
 	}
 
 done:

info.jited_prog_len is an in/out parameter, so we can't write it twice
if we share fields..  Sorry for messing up.
Daniel Borkmann Jan. 12, 2018, 11:31 a.m. UTC | #2
On 01/12/2018 03:17 AM, Jakub Kicinski wrote:
> On Thu, 11 Jan 2018 16:47:47 -0800, Jakub Kicinski wrote:
>> Hi!
>>
>> Jiong is working on dumping JITed NFP image via bpftool, Francois will be
>> submitting support for NFP in binutils soon (whoop! :)).
>>
>> We would appreciate if you could weigh in on the uAPI.  Is it OK to reuse
>> the existing jited_prog_len/jited_prog_insns or should we add separate
>> 2 new fields (plus the arch name) to avoid confusing old user space?
> 
> Ah, I skipped one chunk of Jiong's patch here, this would also be
> necessary if we reuse fields:
What kind of string would sit in jited_arch_name for nfp? Given you know
the {ifindex, netns_dev, netns_ino} 3‑tuple, isn't it possible to infer
the driver info from ethtool (e.g. nfp_net_get_drvinfo()) already and do
the mapping for binutils? Given we know when ifindex is 0, then its
always host JITed and the current approach with bfd_openr() works fine,
and if ifindex is non-0 it is /always/ device offloaded to the one bound
in the ifindex so JIT dump will be specific to that device only and
never host one. Not at all opposed to extending uapi, just a question
on my side to get a better understanding first wrt arch string (maybe
rough but complete patch with nfp + bpftool bits might help too).

> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 2bac0dc..c7831cd 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1673,19 +1673,6 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
>  		goto done;
>  	}
>  
> -	ulen = info.jited_prog_len;
> -	info.jited_prog_len = prog->jited_len;
> -	if (info.jited_prog_len && ulen) {
> -		if (bpf_dump_raw_ok()) {
> -			uinsns = u64_to_user_ptr(info.jited_prog_insns);
> -			ulen = min_t(u32, info.jited_prog_len, ulen);
> -			if (copy_to_user(uinsns, prog->bpf_func, ulen))
> -				return -EFAULT;
> -		} else {
> -			info.jited_prog_insns = 0;
> -		}
> -	}
> -
>  	ulen = info.xlated_prog_len;
>  	info.xlated_prog_len = bpf_prog_insn_size(prog);
>  	if (info.xlated_prog_len && ulen) {
> @@ -1711,6 +1698,21 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
>  		err = bpf_prog_offload_info_fill(&info, prog);
>  		if (err)
>  			return err;
> +		else
> +			goto done;
> +	}
> +
> +	ulen = info.jited_prog_len;
> +	info.jited_prog_len = prog->jited_len;
> +	if (info.jited_prog_len && ulen) {
> +		if (bpf_dump_raw_ok()) {
> +			uinsns = u64_to_user_ptr(info.jited_prog_insns);
> +			ulen = min_t(u32, info.jited_prog_len, ulen);
> +			if (copy_to_user(uinsns, prog->bpf_func, ulen))
> +				return -EFAULT;
> +		} else {
> +			info.jited_prog_insns = 0;
> +		}
>  	}
>  
>  done:
> 
> info.jited_prog_len is an in/out parameter, so we can't write it twice
> if we share fields..  Sorry for messing up.
Jakub Kicinski Jan. 12, 2018, 10:17 p.m. UTC | #3
On Fri, 12 Jan 2018 12:31:15 +0100, Daniel Borkmann wrote:
> On 01/12/2018 03:17 AM, Jakub Kicinski wrote:
> > On Thu, 11 Jan 2018 16:47:47 -0800, Jakub Kicinski wrote:  
> >> Hi!
> >>
> >> Jiong is working on dumping JITed NFP image via bpftool, Francois will be
> >> submitting support for NFP in binutils soon (whoop! :)).
> >>
> >> We would appreciate if you could weigh in on the uAPI.  Is it OK to reuse
> >> the existing jited_prog_len/jited_prog_insns or should we add separate
> >> 2 new fields (plus the arch name) to avoid confusing old user space?  
> > 
> > Ah, I skipped one chunk of Jiong's patch here, this would also be
> > necessary if we reuse fields:  
>
> What kind of string would sit in jited_arch_name for nfp? Given you know
> the {ifindex, netns_dev, netns_ino} 3‑tuple, isn't it possible to infer
> the driver info from ethtool (e.g. nfp_net_get_drvinfo()) already and do
> the mapping for binutils? 

Right, the inference can certainly work.  Probably by matching the PCI
ID of the device?  Or we can just assume it's the NFP since there is
only one upstream BPF offload :)

> Given we know when ifindex is 0, then its always host JITed and the
> current approach with bfd_openr() works fine, and if ifindex is non-0
> it is /always/ device offloaded to the one bound in the ifindex so
> JIT dump will be specific to that device only and never host one. Not
> at all opposed to extending uapi, just a question on my side to get a
> better understanding first wrt arch string (maybe rough but complete
> patch with nfp + bpftool bits might help too).

I was advocating for full separate set of fields, Jiong was trying for
a reuse, and we sort of met in the middle :)  Depends on how one looks
at the definition of the jited_prog_insn field, old user space is not
guaranteed to pay attention to ifindex.  We will drop the arch and reuse
host fields if that's the direction you're leaning in.
Jiong Wang Jan. 15, 2018, 12:27 p.m. UTC | #4
> On Fri, 12 Jan 2018 12:31:15 +0100, Daniel Borkmann wrote:
>> What kind of string would sit in jited_arch_name for nfp?

The name is purely to let libfd know which bfd backend to choose for the
disassembler.

>> Given you know
>> the {ifindex, netns_dev, netns_ino} 3‑tuple, isn't it possible to infer
>> the driver info from ethtool (e.g. nfp_net_get_drvinfo()) already and do
>> the mapping for binutils?
> Right, the inference can certainly work.  Probably by matching the PCI
> ID of the device?  Or we can just assume it's the NFP since there is
> only one upstream BPF offload :)

Checked ethtool source code and I am thinking (and had tried) we could just
call the existing "ifindex_to_name_ns" in bpftool which returns the device
name, then we could use ioctl/SIOCETHTOOL to get the "struct ethtool_drvinfo"
associated with the device, then map the driver name to bpf name.

Will send out new patches following this way shortly, please let me know if
this is not good.

>
>> Given we know when ifindex is 0, then its always host JITed and the
>> current approach with bfd_openr() works fine, and if ifindex is non-0
>> it is /always/ device offloaded to the one bound in the ifindex so
>> JIT dump will be specific to that device only and never host one. Not
>> at all opposed to extending uapi, just a question on my side to get a
>> better understanding first wrt arch string (maybe rough but complete
>> patch with nfp + bpftool bits might help too).
> I was advocating for full separate set of fields, Jiong was trying for
> a reuse, and we sort of met in the middle :)  Depends on how one looks
> at the definition of the jited_prog_insn field, old user space is not
> guaranteed to pay attention to ifindex.  We will drop the arch and reuse
> host fields if that's the direction you're leaning in.

I also want to make sure adding new "jited_image" and "jited_len" is the
correct approach.

Was thinking to reusing exsiting fields for host jit, i.e
bpf_prog->jited_len and bpf_prog->aux->jited_data, but different offload
targets might have different structure for "jited_data" that we need new
call back to parse it, also I feel keep host fields clean for host JIT
only might help preventing code logic for host and offload interleaving
with each other, so had gone with adding new fields.
Daniel Borkmann Jan. 15, 2018, 2:07 p.m. UTC | #5
On 01/15/2018 01:27 PM, Jiong Wang wrote:
>> On Fri, 12 Jan 2018 12:31:15 +0100, Daniel Borkmann wrote:
[...]
>>> Given we know when ifindex is 0, then its always host JITed and the
>>> current approach with bfd_openr() works fine, and if ifindex is non-0
>>> it is /always/ device offloaded to the one bound in the ifindex so
>>> JIT dump will be specific to that device only and never host one. Not
>>> at all opposed to extending uapi, just a question on my side to get a
>>> better understanding first wrt arch string (maybe rough but complete
>>> patch with nfp + bpftool bits might help too).
>> I was advocating for full separate set of fields, Jiong was trying for
>> a reuse, and we sort of met in the middle :)  Depends on how one looks
>> at the definition of the jited_prog_insn field, old user space is not
>> guaranteed to pay attention to ifindex.  We will drop the arch and reuse
>> host fields if that's the direction you're leaning in.
> 
> I also want to make sure adding new "jited_image" and "jited_len" is the
> correct approach.

I think it's fine.

> Was thinking to reusing exsiting fields for host jit, i.e
> bpf_prog->jited_len and bpf_prog->aux->jited_data, but different offload
> targets might have different structure for "jited_data" that we need new
> call back to parse it, also I feel keep host fields clean for host JIT
> only might help preventing code logic for host and offload interleaving
> with each other, so had gone with adding new fields.

Agree, one thing less to worry since this also ties into kallsyms, etc.

Ok, lets stick with the current RFC, they seem to be the cleanest approach
overall with having offload_arch_name[] in the uapi and only filled by
driver JITs (and not host JITs).

Thanks,
Daniel
Alexei Starovoitov Jan. 15, 2018, 5:13 p.m. UTC | #6
On Mon, Jan 15, 2018 at 03:07:25PM +0100, Daniel Borkmann wrote:
> On 01/15/2018 01:27 PM, Jiong Wang wrote:
> >> On Fri, 12 Jan 2018 12:31:15 +0100, Daniel Borkmann wrote:
> [...]
> >>> Given we know when ifindex is 0, then its always host JITed and the
> >>> current approach with bfd_openr() works fine, and if ifindex is non-0
> >>> it is /always/ device offloaded to the one bound in the ifindex so
> >>> JIT dump will be specific to that device only and never host one. Not
> >>> at all opposed to extending uapi, just a question on my side to get a
> >>> better understanding first wrt arch string (maybe rough but complete
> >>> patch with nfp + bpftool bits might help too).
> >> I was advocating for full separate set of fields, Jiong was trying for
> >> a reuse, and we sort of met in the middle :)  Depends on how one looks
> >> at the definition of the jited_prog_insn field, old user space is not
> >> guaranteed to pay attention to ifindex.  We will drop the arch and reuse
> >> host fields if that's the direction you're leaning in.
> > 
> > I also want to make sure adding new "jited_image" and "jited_len" is the
> > correct approach.
> 
> I think it's fine.
> 
> > Was thinking to reusing exsiting fields for host jit, i.e
> > bpf_prog->jited_len and bpf_prog->aux->jited_data, but different offload
> > targets might have different structure for "jited_data" that we need new
> > call back to parse it, also I feel keep host fields clean for host JIT
> > only might help preventing code logic for host and offload interleaving
> > with each other, so had gone with adding new fields.
> 
> Agree, one thing less to worry since this also ties into kallsyms, etc.

why would bytes in jited_prog_insns conflict with kallsyms ?

I also agree that adding two extra fields for 'offloaded_prog + len'
is probably cleanest.

> Ok, lets stick with the current RFC, they seem to be the cleanest approach
> overall with having offload_arch_name[] in the uapi and only filled by
> driver JITs (and not host JITs).

I don't like string based interfaces especially when strings are used
by tools to compare with other strings.
imo strings are only good for humans to read.
I think returning ifindex is already enough. From it the tool
can get pci id which is the most accurate identification of the hw.
diff mbox series

Patch

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9e03046d1df2..d0cb9735bbba 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -197,6 +197,9 @@  struct bpf_dev_offload {
 	struct list_head	offloads;
 	bool			dev_state;
 	const struct bpf_prog_offload_ops *dev_ops;
+	void			*jited_image;
+	u32			jited_len;
+	char			jited_arch_name[BPF_OFFLOAD_ARCH_NAME_LEN];
 };
 
 struct bpf_prog_aux {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 405317f9c064..124560b982df 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -226,6 +226,7 @@  enum bpf_attach_type {
 #define BPF_F_QUERY_EFFECTIVE	(1U << 0)
 
 #define BPF_OBJ_NAME_LEN 16U
+#define BPF_OFFLOAD_ARCH_NAME_LEN 16U
 
 /* Flags for accessing BPF object */
 #define BPF_F_RDONLY		(1U << 3)
@@ -927,6 +928,7 @@  struct bpf_prog_info {
 	__u32 ifindex;
 	__u64 netns_dev;
 	__u64 netns_ino;
+	char offload_arch_name[BPF_OFFLOAD_ARCH_NAME_LEN];
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {
diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
index 040d4e0edf3f..88b4396d19aa 100644
--- a/kernel/bpf/offload.c
+++ b/kernel/bpf/offload.c
@@ -216,9 +216,12 @@  int bpf_prog_offload_info_fill(struct bpf_prog_info *info,
 		.prog	= prog,
 		.info	= info,
 	};
+	struct bpf_prog_aux *aux = prog->aux;
 	struct inode *ns_inode;
 	struct path ns_path;
+	char __user *uinsns;
 	void *res;
+	u32 ulen;
 
 	res = ns_get_path_cb(&ns_path, bpf_prog_offload_info_fill_ns, &args);
 	if (IS_ERR(res)) {
@@ -227,6 +230,29 @@  int bpf_prog_offload_info_fill(struct bpf_prog_info *info,
 		return PTR_ERR(res);
 	}
 
+
+	down_read(&bpf_devs_lock);
+	if (!aux->offload) {
+		up_read(&bpf_devs_lock);
+		return -ENODEV;
+	}
+
+	ulen = info->jited_prog_len;
+	info->jited_prog_len = aux->offload->jited_len;
+	if (info->jited_prog_len & ulen) {
+		uinsns = u64_to_user_ptr(info->jited_prog_insns);
+		ulen = min_t(u32, info->jited_prog_len, ulen);
+		if (copy_to_user(uinsns, aux->offload->jited_image, ulen)) {
+			up_read(&bpf_devs_lock);
+			return -EFAULT;
+		}
+	}
+
+	memcpy(info->offload_arch_name, aux->offload->jited_arch_name,
+	       sizeof(info->offload_arch_name));
+
+	up_read(&bpf_devs_lock);
+
 	ns_inode = ns_path.dentry->d_inode;
 	info->netns_dev = new_encode_dev(ns_inode->i_sb->s_dev);
 	info->netns_ino = ns_inode->i_ino;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4e8c60acfa32..647aee66f4da 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -226,6 +226,7 @@  enum bpf_attach_type {
 #define BPF_F_QUERY_EFFECTIVE	(1U << 0)
 
 #define BPF_OBJ_NAME_LEN 16U
+#define BPF_OFFLOAD_ARCH_NAME_LEN 16U
 
 /* Flags for accessing BPF object */
 #define BPF_F_RDONLY		(1U << 3)
@@ -924,6 +925,7 @@  struct bpf_prog_info {
 	__u32 ifindex;
 	__u64 netns_dev;
 	__u64 netns_ino;
+	char offload_arch_name[BPF_OFFLOAD_ARCH_NAME_LEN];
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {