diff mbox

[v2,net-next,1/5] bpf: Refactor cgroups code in prep for new type

Message ID 1477529922-4806-2-git-send-email-dsa@cumulusnetworks.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

David Ahern Oct. 27, 2016, 12:58 a.m. UTC
Code move only and rename only; no functional change intended.

v2
- fix bpf_prog_run_clear_cb to bpf_prog_run_save_cb as caught by Daniel

- rename BPF_PROG_TYPE_CGROUP_SKB and its cg_skb functions to
  BPF_PROG_TYPE_CGROUP and cgroup

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 include/uapi/linux/bpf.h        |  2 +-
 kernel/bpf/cgroup.c             | 27 ++++++++++++++++++++++-----
 kernel/bpf/syscall.c            |  2 +-
 net/core/filter.c               | 14 +++++++-------
 samples/bpf/test_cgrp2_attach.c |  4 ++--
 5 files changed, 33 insertions(+), 16 deletions(-)

Comments

David Miller Oct. 31, 2016, 4:58 p.m. UTC | #1
From: David Ahern <dsa@cumulusnetworks.com>
Date: Wed, 26 Oct 2016 17:58:38 -0700

> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 6b62ee9a2f78..73da296c2125 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -98,7 +98,7 @@ enum bpf_prog_type {
>  	BPF_PROG_TYPE_TRACEPOINT,
>  	BPF_PROG_TYPE_XDP,
>  	BPF_PROG_TYPE_PERF_EVENT,
> -	BPF_PROG_TYPE_CGROUP_SKB,
> +	BPF_PROG_TYPE_CGROUP,
>  };
>  
>  enum bpf_attach_type {

If we do this then the cgroup-bpf series should use this value rather than
changing it after-the-fact in your series here.
Daniel Mack Oct. 31, 2016, 5 p.m. UTC | #2
On 10/31/2016 05:58 PM, David Miller wrote:
> From: David Ahern <dsa@cumulusnetworks.com>
> Date: Wed, 26 Oct 2016 17:58:38 -0700
> 
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 6b62ee9a2f78..73da296c2125 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -98,7 +98,7 @@ enum bpf_prog_type {
>>  	BPF_PROG_TYPE_TRACEPOINT,
>>  	BPF_PROG_TYPE_XDP,
>>  	BPF_PROG_TYPE_PERF_EVENT,
>> -	BPF_PROG_TYPE_CGROUP_SKB,
>> +	BPF_PROG_TYPE_CGROUP,
>>  };
>>  
>>  enum bpf_attach_type {
> 
> If we do this then the cgroup-bpf series should use this value rather than
> changing it after-the-fact in your series here.
> 

Yeah, I'm confused too. I changed that name in my v7 from
BPF_PROG_TYPE_CGROUP_SOCK to BPF_PROG_TYPE_CGROUP_SKB on David's (Ahern)
request. Why is it now renamed again?


Thanks,
Daniel
David Ahern Oct. 31, 2016, 5:05 p.m. UTC | #3
On 10/31/16 11:00 AM, Daniel Mack wrote:
> On 10/31/2016 05:58 PM, David Miller wrote:
>> From: David Ahern <dsa@cumulusnetworks.com>
>> Date: Wed, 26 Oct 2016 17:58:38 -0700
>>
>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>>> index 6b62ee9a2f78..73da296c2125 100644
>>> --- a/include/uapi/linux/bpf.h
>>> +++ b/include/uapi/linux/bpf.h
>>> @@ -98,7 +98,7 @@ enum bpf_prog_type {
>>>  	BPF_PROG_TYPE_TRACEPOINT,
>>>  	BPF_PROG_TYPE_XDP,
>>>  	BPF_PROG_TYPE_PERF_EVENT,
>>> -	BPF_PROG_TYPE_CGROUP_SKB,
>>> +	BPF_PROG_TYPE_CGROUP,
>>>  };
>>>  
>>>  enum bpf_attach_type {
>>
>> If we do this then the cgroup-bpf series should use this value rather than
>> changing it after-the-fact in your series here.
>>
> 
> Yeah, I'm confused too. I changed that name in my v7 from
> BPF_PROG_TYPE_CGROUP_SOCK to BPF_PROG_TYPE_CGROUP_SKB on David's (Ahern)
> request. Why is it now renamed again?

Thomas pushed back on adding another program type in favor of using subtypes. So this makes the program type generic to CGROUP and patch 2 in this v2 set added Mickaël's subtype patch with the socket mangling done that way in patch 3.
Daniel Mack Oct. 31, 2016, 5:16 p.m. UTC | #4
On 10/31/2016 06:05 PM, David Ahern wrote:
> On 10/31/16 11:00 AM, Daniel Mack wrote:
>> On 10/31/2016 05:58 PM, David Miller wrote:
>>> From: David Ahern <dsa@cumulusnetworks.com> Date: Wed, 26 Oct
>>> 2016 17:58:38 -0700
>>> 
>>>> diff --git a/include/uapi/linux/bpf.h
>>>> b/include/uapi/linux/bpf.h index 6b62ee9a2f78..73da296c2125
>>>> 100644 --- a/include/uapi/linux/bpf.h +++
>>>> b/include/uapi/linux/bpf.h @@ -98,7 +98,7 @@ enum bpf_prog_type
>>>> { BPF_PROG_TYPE_TRACEPOINT, BPF_PROG_TYPE_XDP, 
>>>> BPF_PROG_TYPE_PERF_EVENT, -	BPF_PROG_TYPE_CGROUP_SKB, +
>>>> BPF_PROG_TYPE_CGROUP, };
>>>> 
>>>> enum bpf_attach_type {
>>> 
>>> If we do this then the cgroup-bpf series should use this value
>>> rather than changing it after-the-fact in your series here.
>>> 
>> 
>> Yeah, I'm confused too. I changed that name in my v7 from 
>> BPF_PROG_TYPE_CGROUP_SOCK to BPF_PROG_TYPE_CGROUP_SKB on David's
>> (Ahern) request. Why is it now renamed again?
> 
> Thomas pushed back on adding another program type in favor of using
> subtypes. So this makes the program type generic to CGROUP and patch
> 2 in this v2 set added Mickaël's subtype patch with the socket
> mangling done that way in patch 3.
> 

Fine for me. I can change it around again.


Thanks,
Daniel
Thomas Graf Oct. 31, 2016, 5:49 p.m. UTC | #5
On 10/31/16 at 06:16pm, Daniel Mack wrote:
> On 10/31/2016 06:05 PM, David Ahern wrote:
> > On 10/31/16 11:00 AM, Daniel Mack wrote:
> >> Yeah, I'm confused too. I changed that name in my v7 from 
> >> BPF_PROG_TYPE_CGROUP_SOCK to BPF_PROG_TYPE_CGROUP_SKB on David's
> >> (Ahern) request. Why is it now renamed again?
> > 
> > Thomas pushed back on adding another program type in favor of using
> > subtypes. So this makes the program type generic to CGROUP and patch
> > 2 in this v2 set added Mickaël's subtype patch with the socket
> > mangling done that way in patch 3.
> > 
> 
> Fine for me. I can change it around again.

I would like to hear from Daniel B and Alexei as well. We need to
decide whether to use subtypes consistently and treat prog types as
something more high level or whether to bluntly introduce a new prog
type for every distinct set of verifier limits. I will change lwt_bpf
as well accordingly.
David Ahern Nov. 14, 2016, 3:51 a.m. UTC | #6
On 10/31/16 11:49 AM, Thomas Graf wrote:
> On 10/31/16 at 06:16pm, Daniel Mack wrote:
>> On 10/31/2016 06:05 PM, David Ahern wrote:
>>> On 10/31/16 11:00 AM, Daniel Mack wrote:
>>>> Yeah, I'm confused too. I changed that name in my v7 from 
>>>> BPF_PROG_TYPE_CGROUP_SOCK to BPF_PROG_TYPE_CGROUP_SKB on David's
>>>> (Ahern) request. Why is it now renamed again?
>>>
>>> Thomas pushed back on adding another program type in favor of using
>>> subtypes. So this makes the program type generic to CGROUP and patch
>>> 2 in this v2 set added Mickaël's subtype patch with the socket
>>> mangling done that way in patch 3.
>>>
>>
>> Fine for me. I can change it around again.
> 
> I would like to hear from Daniel B and Alexei as well. We need to
> decide whether to use subtypes consistently and treat prog types as
> something more high level or whether to bluntly introduce a new prog
> type for every distinct set of verifier limits. I will change lwt_bpf
> as well accordingly.
> 

Alexei / Daniel - any comments/preferences on subtypes vs program types?
Alexei Starovoitov Nov. 15, 2016, 2:23 a.m. UTC | #7
On 11/13/16 7:51 PM, David Ahern wrote:
> On 10/31/16 11:49 AM, Thomas Graf wrote:
>> On 10/31/16 at 06:16pm, Daniel Mack wrote:
>>> On 10/31/2016 06:05 PM, David Ahern wrote:
>>>> On 10/31/16 11:00 AM, Daniel Mack wrote:
>>>>> Yeah, I'm confused too. I changed that name in my v7 from
>>>>> BPF_PROG_TYPE_CGROUP_SOCK to BPF_PROG_TYPE_CGROUP_SKB on David's
>>>>> (Ahern) request. Why is it now renamed again?
>>>>
>>>> Thomas pushed back on adding another program type in favor of using
>>>> subtypes. So this makes the program type generic to CGROUP and patch
>>>> 2 in this v2 set added Mickaël's subtype patch with the socket
>>>> mangling done that way in patch 3.
>>>>
>>>
>>> Fine for me. I can change it around again.
>>
>> I would like to hear from Daniel B and Alexei as well. We need to
>> decide whether to use subtypes consistently and treat prog types as
>> something more high level or whether to bluntly introduce a new prog
>> type for every distinct set of verifier limits. I will change lwt_bpf
>> as well accordingly.
>>
>
> Alexei / Daniel - any comments/preferences on subtypes vs program types?

looks like in this particular case it's better to use different program
types, since they all serve different purpose and context is different.
Daniel Mack's programs can stay BPF_PROG_TYPE_CGROUP_SKB and operate on 
skb, whereas your ifindex changing programs can be 
BPF_PROG_TYPE_CGROUP_SOCK.

regarding DanielM's patches.. Dave's comment regarding skb->sk vs sk
made us think hard about it. We looked into tunnels and at the end
realized that skb->sk won't fly, since the program will be called
multiple times on tx for every ip_output. And since we were not sure
that using 'sk' will fix it, we had to do a bunch of tests with
different tunnels and analyze the code to make sure it's all good.
We'll repost soon.
diff mbox

Patch

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 6b62ee9a2f78..73da296c2125 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -98,7 +98,7 @@  enum bpf_prog_type {
 	BPF_PROG_TYPE_TRACEPOINT,
 	BPF_PROG_TYPE_XDP,
 	BPF_PROG_TYPE_PERF_EVENT,
-	BPF_PROG_TYPE_CGROUP_SKB,
+	BPF_PROG_TYPE_CGROUP,
 };
 
 enum bpf_attach_type {
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index a0ab43f264b0..d5746aec8f34 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -117,6 +117,19 @@  void __cgroup_bpf_update(struct cgroup *cgrp,
 	}
 }
 
+static int __cgroup_bpf_run_filter_skb(struct sk_buff *skb,
+				       struct bpf_prog *prog)
+{
+	unsigned int offset = skb->data - skb_network_header(skb);
+	int ret;
+
+	__skb_push(skb, offset);
+	ret = bpf_prog_run_save_cb(prog, skb) == 1 ? 0 : -EPERM;
+	__skb_pull(skb, offset);
+
+	return ret;
+}
+
 /**
  * __cgroup_bpf_run_filter() - Run a program for packet filtering
  * @sk: The socken sending or receiving traffic
@@ -153,11 +166,15 @@  int __cgroup_bpf_run_filter(struct sock *sk,
 
 	prog = rcu_dereference(cgrp->bpf.effective[type]);
 	if (prog) {
-		unsigned int offset = skb->data - skb_network_header(skb);
-
-		__skb_push(skb, offset);
-		ret = bpf_prog_run_save_cb(prog, skb) == 1 ? 0 : -EPERM;
-		__skb_pull(skb, offset);
+		switch (type) {
+		case BPF_CGROUP_INET_INGRESS:
+		case BPF_CGROUP_INET_EGRESS:
+			ret = __cgroup_bpf_run_filter_skb(skb, prog);
+			break;
+		/* make gcc happy else complains about missing enum value */
+		default:
+			return 0;
+		}
 	}
 
 	rcu_read_unlock();
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 1814c010ace6..3c4c9ed2d576 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -841,7 +841,7 @@  static int bpf_prog_attach(const union bpf_attr *attr)
 	case BPF_CGROUP_INET_INGRESS:
 	case BPF_CGROUP_INET_EGRESS:
 		prog = bpf_prog_get_type(attr->attach_bpf_fd,
-					 BPF_PROG_TYPE_CGROUP_SKB);
+					 BPF_PROG_TYPE_CGROUP);
 		if (IS_ERR(prog))
 			return PTR_ERR(prog);
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 4552b8c93b99..61e229c3a862 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2583,7 +2583,7 @@  xdp_func_proto(enum bpf_func_id func_id)
 }
 
 static const struct bpf_func_proto *
-cg_skb_func_proto(enum bpf_func_id func_id)
+cgroup_func_proto(enum bpf_func_id func_id)
 {
 	switch (func_id) {
 	case BPF_FUNC_skb_load_bytes:
@@ -2955,8 +2955,8 @@  static const struct bpf_verifier_ops xdp_ops = {
 	.convert_ctx_access	= xdp_convert_ctx_access,
 };
 
-static const struct bpf_verifier_ops cg_skb_ops = {
-	.get_func_proto		= cg_skb_func_proto,
+static const struct bpf_verifier_ops cgroup_ops = {
+	.get_func_proto		= cgroup_func_proto,
 	.is_valid_access	= sk_filter_is_valid_access,
 	.convert_ctx_access	= sk_filter_convert_ctx_access,
 };
@@ -2981,9 +2981,9 @@  static struct bpf_prog_type_list xdp_type __read_mostly = {
 	.type	= BPF_PROG_TYPE_XDP,
 };
 
-static struct bpf_prog_type_list cg_skb_type __read_mostly = {
-	.ops	= &cg_skb_ops,
-	.type	= BPF_PROG_TYPE_CGROUP_SKB,
+static struct bpf_prog_type_list cgroup_type __read_mostly = {
+	.ops	= &cgroup_ops,
+	.type	= BPF_PROG_TYPE_CGROUP,
 };
 
 static int __init register_sk_filter_ops(void)
@@ -2992,7 +2992,7 @@  static int __init register_sk_filter_ops(void)
 	bpf_register_prog_type(&sched_cls_type);
 	bpf_register_prog_type(&sched_act_type);
 	bpf_register_prog_type(&xdp_type);
-	bpf_register_prog_type(&cg_skb_type);
+	bpf_register_prog_type(&cgroup_type);
 
 	return 0;
 }
diff --git a/samples/bpf/test_cgrp2_attach.c b/samples/bpf/test_cgrp2_attach.c
index 63ef2083f766..468e68e6d511 100644
--- a/samples/bpf/test_cgrp2_attach.c
+++ b/samples/bpf/test_cgrp2_attach.c
@@ -69,8 +69,8 @@  static int prog_load(int map_fd, int verdict)
 		BPF_EXIT_INSN(),
 	};
 
-	return bpf_prog_load(BPF_PROG_TYPE_CGROUP_SKB,
-			     prog, sizeof(prog), "GPL", 0);
+	return bpf_prog_load(BPF_PROG_TYPE_CGROUP,
+			     prog, sizeof(prog), "GPL", 0, NULL);
 }
 
 static int usage(const char *argv0)