[bpf-next,02/12] net: Add BPF_XDP_EGRESS as a bpf_attach_type
diff mbox series

Message ID 20200123014210.38412-3-dsahern@kernel.org
State Changes Requested
Delegated to: BPF Maintainers
Headers show
Series
  • Add support for XDP in egress path
Related show

Commit Message

David Ahern Jan. 23, 2020, 1:42 a.m. UTC
From: Prashant Bhole <prashantbhole.linux@gmail.com>

Add new bpf_attach_type, BPF_XDP_EGRESS, for BPF programs attached
at the XDP layer, but the egress path.

Since egress path does not have rx_queue_index and ingress_ifindex set,
update xdp_is_valid_access to block access to these entries in the xdp
context when a program is attached to egress path.

Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: David Ahern <dahern@digitalocean.com>
---
 include/uapi/linux/bpf.h       | 1 +
 net/core/filter.c              | 8 ++++++++
 tools/include/uapi/linux/bpf.h | 1 +
 3 files changed, 10 insertions(+)

Comments

Toke Høiland-Jørgensen Jan. 23, 2020, 11:34 a.m. UTC | #1
David Ahern <dsahern@kernel.org> writes:

> From: Prashant Bhole <prashantbhole.linux@gmail.com>
>
> Add new bpf_attach_type, BPF_XDP_EGRESS, for BPF programs attached
> at the XDP layer, but the egress path.
>
> Since egress path does not have rx_queue_index and ingress_ifindex set,
> update xdp_is_valid_access to block access to these entries in the xdp
> context when a program is attached to egress path.

Isn't the whole point of this to be able to use unchanged XDP programs?
But now you're introducing a semantic difference. Since supposedly only
point-to-point links are going to be using this attach type, don't they
know enough about their peer device to be able to populate those fields
with meaningful values, instead of restricting access to them?

-Toke
David Ahern Jan. 23, 2020, 9:32 p.m. UTC | #2
On 1/23/20 4:34 AM, Toke Høiland-Jørgensen wrote:
> David Ahern <dsahern@kernel.org> writes:
> 
>> From: Prashant Bhole <prashantbhole.linux@gmail.com>
>>
>> Add new bpf_attach_type, BPF_XDP_EGRESS, for BPF programs attached
>> at the XDP layer, but the egress path.
>>
>> Since egress path does not have rx_queue_index and ingress_ifindex set,
>> update xdp_is_valid_access to block access to these entries in the xdp
>> context when a program is attached to egress path.
> 
> Isn't the whole point of this to be able to use unchanged XDP programs?

See patch 12. Only the userspace code was changed to load the same
program with the egress attach type set.

The verifier needs to check the egress program does not access Rx only
entries in xdp_md context. The attach type allows that check.

> But now you're introducing a semantic difference. Since supposedly only
> point-to-point links are going to be using this attach type, don't they
> know enough about their peer device to be able to populate those fields
> with meaningful values, instead of restricting access to them?
> 

You are conflating use cases. Don't assume point to point or peer devices.

This could be a REDIRECT from eth0 to eth1 and then an EGRESS program on
eth1 to do something. In the current test scenario it is REDIRECT from
eth0 to tapN and then on tapN run an egress program (Tx for a tap is
ingress to the VM).
Martin KaFai Lau Jan. 24, 2020, 7:33 a.m. UTC | #3
On Wed, Jan 22, 2020 at 06:42:00PM -0700, David Ahern wrote:
> From: Prashant Bhole <prashantbhole.linux@gmail.com>
> 
> Add new bpf_attach_type, BPF_XDP_EGRESS, for BPF programs attached
> at the XDP layer, but the egress path.
> 
> Since egress path does not have rx_queue_index and ingress_ifindex set,
> update xdp_is_valid_access to block access to these entries in the xdp
> context when a program is attached to egress path.
> 
> Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
> Signed-off-by: David Ahern <dahern@digitalocean.com>
> ---
>  include/uapi/linux/bpf.h       | 1 +
>  net/core/filter.c              | 8 ++++++++
>  tools/include/uapi/linux/bpf.h | 1 +
>  3 files changed, 10 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 033d90a2282d..72f2a9a4621e 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -209,6 +209,7 @@ enum bpf_attach_type {
>  	BPF_TRACE_RAW_TP,
>  	BPF_TRACE_FENTRY,
>  	BPF_TRACE_FEXIT,
> +	BPF_XDP_EGRESS,
>  	__MAX_BPF_ATTACH_TYPE
>  };
>  
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 17de6747d9e3..a903f3a15d74 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -6803,6 +6803,14 @@ static bool xdp_is_valid_access(int off, int size,
>  		return false;
>  	}
>  
> +	if (prog->expected_attach_type == BPF_XDP_EGRESS) {
For BPF_PROG_TYPE_XDP, the expected_attach_type is currently not
enforced to be 0 in bpf_prog_load_check_attach().  Not sure if it
is ok to test it here and also return false in some of the
following switch cases.

> +		switch (off) {
> +		case offsetof(struct xdp_md, rx_queue_index):
> +		case offsetof(struct xdp_md, ingress_ifindex):
> +			return false;
> +		}
> +	}
> +
>  	switch (off) {
>  	case offsetof(struct xdp_md, data):
>  		info->reg_type = PTR_TO_PACKET;
Toke Høiland-Jørgensen Jan. 24, 2020, 9:49 a.m. UTC | #4
David Ahern <dsahern@gmail.com> writes:

> On 1/23/20 4:34 AM, Toke Høiland-Jørgensen wrote:
>> David Ahern <dsahern@kernel.org> writes:
>> 
>>> From: Prashant Bhole <prashantbhole.linux@gmail.com>
>>>
>>> Add new bpf_attach_type, BPF_XDP_EGRESS, for BPF programs attached
>>> at the XDP layer, but the egress path.
>>>
>>> Since egress path does not have rx_queue_index and ingress_ifindex set,
>>> update xdp_is_valid_access to block access to these entries in the xdp
>>> context when a program is attached to egress path.
>> 
>> Isn't the whole point of this to be able to use unchanged XDP programs?
>
> See patch 12. Only the userspace code was changed to load the same
> program with the egress attach type set.
>
> The verifier needs to check the egress program does not access Rx only
> entries in xdp_md context. The attach type allows that check.
>
>> But now you're introducing a semantic difference. Since supposedly only
>> point-to-point links are going to be using this attach type, don't they
>> know enough about their peer device to be able to populate those fields
>> with meaningful values, instead of restricting access to them?
>> 
>
> You are conflating use cases. Don't assume point to point or peer devices.
>
> This could be a REDIRECT from eth0 to eth1 and then an EGRESS program on
> eth1 to do something. In the current test scenario it is REDIRECT from
> eth0 to tapN and then on tapN run an egress program (Tx for a tap is
> ingress to the VM).

But why would any hardware driver implement this program type, instead
of the proper TX hook? I thought the whole idea of this "third"
not-quite-TX program type was the virtualisation offload case. If you
just want to be able to run eBPF code in the TX path, why not implement
the proper TX hook straight away?

-Toke

Patch
diff mbox series

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 033d90a2282d..72f2a9a4621e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -209,6 +209,7 @@  enum bpf_attach_type {
 	BPF_TRACE_RAW_TP,
 	BPF_TRACE_FENTRY,
 	BPF_TRACE_FEXIT,
+	BPF_XDP_EGRESS,
 	__MAX_BPF_ATTACH_TYPE
 };
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 17de6747d9e3..a903f3a15d74 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6803,6 +6803,14 @@  static bool xdp_is_valid_access(int off, int size,
 		return false;
 	}
 
+	if (prog->expected_attach_type == BPF_XDP_EGRESS) {
+		switch (off) {
+		case offsetof(struct xdp_md, rx_queue_index):
+		case offsetof(struct xdp_md, ingress_ifindex):
+			return false;
+		}
+	}
+
 	switch (off) {
 	case offsetof(struct xdp_md, data):
 		info->reg_type = PTR_TO_PACKET;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 033d90a2282d..72f2a9a4621e 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -209,6 +209,7 @@  enum bpf_attach_type {
 	BPF_TRACE_RAW_TP,
 	BPF_TRACE_FENTRY,
 	BPF_TRACE_FEXIT,
+	BPF_XDP_EGRESS,
 	__MAX_BPF_ATTACH_TYPE
 };