[net-next] bpf: allow BPF programs access 'protocol' and 'vlan_tci' fields

Message ID: 1426554362-29991-1-git-send-email-ast@plumgrid.com
State: Accepted, archived
Delegated to: David Miller

Commit Message

Alexei Starovoitov March 17, 2015, 1:06 a.m. UTC
As a follow-on to commit 70006af95515 ("bpf: allow eBPF access skb fields"),
this patch makes the 'protocol', 'vlan_present' and 'vlan_tci' fields
accessible from extended BPF programs.

The usage of 'protocol', 'vlan_present' and 'vlan_tci' fields is the same as
corresponding SKF_AD_PROTOCOL, SKF_AD_VLAN_TAG_PRESENT and SKF_AD_VLAN_TAG
accesses in classic BPF.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---

1.
I considered dropping ntohs() from the 'protocol' field for extended BPF, since
programs could instead do:
if (skb->protocol == htons(ETH_P_IP))
which would save one or two CPU cycles.
But keeping the behavior consistent between classic and extended BPF seems better.

2.
The name 'vlan_tci' is chosen to match the real sk_buff->vlan_tci field
and tpacket's tp_vlan_tci field.

 include/uapi/linux/bpf.h    |    3 ++
 net/core/filter.c           |   72 ++++++++++++++++++++++++++++++-------------
 samples/bpf/test_verifier.c |    9 ++++++
 3 files changed, 62 insertions(+), 22 deletions(-)
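For readers following along, here is a rough user-space model of what the three new fields evaluate to under this patch, based on the instruction sequences emitted in convert_skb_access(). It is only a sketch: struct fake_skb and the read_*() helpers are hypothetical stand-ins, not kernel code, and VLAN_TAG_PRESENT is assumed to be 0x1000 as the patch's BUILD_BUG_ON checks.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htons(), ntohs() */

#define VLAN_TAG_PRESENT 0x1000	/* value checked by BUILD_BUG_ON in the patch */
#define ETH_P_IP         0x0800

/* Hypothetical stand-in for the two sk_buff members the patch loads. */
struct fake_skb {
	uint16_t protocol;	/* network byte order */
	uint16_t vlan_tci;	/* bit 12 is used as the "tag present" flag */
};

/* 'protocol': 16-bit load followed by a big-endian to host conversion. */
uint32_t read_protocol(const struct fake_skb *skb)
{
	return ntohs(skb->protocol);
}

/* 'vlan_present': 16-bit load, shift right by 12, mask with 1. */
uint32_t read_vlan_present(const struct fake_skb *skb)
{
	return ((uint32_t)skb->vlan_tci >> 12) & 1;
}

/* 'vlan_tci': 16-bit load with the present bit masked out. */
uint32_t read_vlan_tci(const struct fake_skb *skb)
{
	return (uint32_t)skb->vlan_tci & ~(uint32_t)VLAN_TAG_PRESENT;
}

/* Worked example: an IPv4 frame tagged with VLAN ID 100. */
void example_tagged_ipv4(void)
{
	struct fake_skb skb = {
		.protocol = htons(ETH_P_IP),
		.vlan_tci = VLAN_TAG_PRESENT | 100,
	};

	assert(read_protocol(&skb) == ETH_P_IP);
	assert(read_vlan_present(&skb) == 1);
	assert(read_vlan_tci(&skb) == 100);
}
```

In this model a program compares 'protocol' against ETH_P_IP directly, without htons(), which matches the classic BPF SKF_AD_PROTOCOL behavior.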

Comments

Daniel Borkmann March 17, 2015, 9:22 a.m. UTC | #1
On 03/17/2015 02:06 AM, Alexei Starovoitov wrote:
> as a follow on to patch 70006af95515 ("bpf: allow eBPF access skb fields")
> this patch allows 'protocol' and 'vlan_tci' fields to be accessible
> from extended BPF programs.
>
> The usage of 'protocol', 'vlan_present' and 'vlan_tci' fields is the same as
> corresponding SKF_AD_PROTOCOL, SKF_AD_VLAN_TAG_PRESENT and SKF_AD_VLAN_TAG
> accesses in classic BPF.
>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Ok, code looks good to me.

> 1.
> I was thinking to drop ntohs() from 'protocol' field for extended BPF, since
> the programs could do:
> if (skb->protocol == htons(ETH_P_IP))
> which would have saved one or two cpu cycles.
> But having similar behavior between classic and extended seems to be better.

I'm thinking that skb->protocol == htons(ETH_P_IP) might actually
be more obvious, and, as you mentioned, the compiler can already
resolve the htons() during compile time instead of runtime, which
would be another plus.

Either behavior we should document later anyway.

The question to me here is, do we need to keep similar behavior?

After all, the way of programming both from a user perspective is
quite different (i.e. bpf_asm versus C/LLVM).

Similarly, I was wondering whether just exporting the raw skb->vlan_tci is
already sufficient, so that users can e.g. write helpers to extract the
bits themselves from that field?

> 2.
> 'vlan_tci' name is picked to match real sk_buff->vlan_tci field
> and matches tpacket's tp_vlan_tci field.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexei Starovoitov March 17, 2015, 5:56 p.m. UTC | #2
On 3/17/15 2:22 AM, Daniel Borkmann wrote:
> On 03/17/2015 02:06 AM, Alexei Starovoitov wrote:
>> as a follow on to patch 70006af95515 ("bpf: allow eBPF access skb
>> fields")
>> this patch allows 'protocol' and 'vlan_tci' fields to be accessible
>> from extended BPF programs.
>>
>> The usage of 'protocol', 'vlan_present' and 'vlan_tci' fields is the
>> same as
>> corresponding SKF_AD_PROTOCOL, SKF_AD_VLAN_TAG_PRESENT and
>> SKF_AD_VLAN_TAG
>> accesses in classic BPF.
>>
>> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
>
> Ok, code looks good to me.
>
>> 1.
>> I was thinking to drop ntohs() from 'protocol' field for extended BPF,
>> since
>> the programs could do:
>> if (skb->protocol == htons(ETH_P_IP))
>> which would have saved one or two cpu cycles.
>> But having similar behavior between classic and extended seems to be
>> better.
>
> I'm thinking that skb->protocol == htons(ETH_P_IP) might actually
> be more obvious, and, as you mentioned, the compiler can already
> resolve the htons() during compile time instead of runtime, which
> would be another plus.
>
> Either behavior we should document later anyway.
>
> The question to me here is, do we need to keep similar behavior?
>
> After all, the way of programming both from a user perspective is
> quite different (i.e. bpf_asm versus C/LLVM).

yeah. we don't have to. Somehow I felt that keeping ntohs will make
it easier for folks moving from classic to extended, but I guess
they're different enough, so no point wasting run time cycles.
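To illustrate the trade-off being discussed, the sketch below compares the two styles in plain user-space C. The function names are hypothetical. The point is that both checks give the same answer, but in the second one the byte swap is applied to a constant, which a compiler can typically fold at build time, so no swap is needed per packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htons(), ntohs() */

#define ETH_P_IP 0x0800

/* Classic-style: the field is converted to host order on every access. */
bool is_ipv4_host_order(uint16_t proto_be)
{
	return ntohs(proto_be) == ETH_P_IP;
}

/* Raw-field style: the constant is converted instead of the field. */
bool is_ipv4_wire_order(uint16_t proto_be)
{
	return proto_be == htons(ETH_P_IP);
}

/* Worked example: both styles agree for IPv4 and IPv6 ethertypes. */
void example_protocol_check(void)
{
	assert(is_ipv4_host_order(htons(ETH_P_IP)));
	assert(is_ipv4_wire_order(htons(ETH_P_IP)));
	assert(!is_ipv4_host_order(htons(0x86DD)));
	assert(!is_ipv4_wire_order(htons(0x86DD)));
}
```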

> Similarly, I was wondering, if just exporting raw skb->vlan_tci is
> already sufficient, and the user can e.g. write helpers to extract
> bits himself from that protocol field?

yes. I thought about the same. Currently the VLAN_TAG_PRESENT bit is not
officially exposed to user space, only implicitly: that bit is always
cleared when we return the tci to user space, and it is always set when
drivers indicate that a vlan header was present in the packet.
So I think we can return skb->vlan_tci as-is, since it will save
one load in bpf program which will be able to do
if (skb->vlan_tci != 0) /* vlan header is present */
      vid = skb->vlan_tci & 0x0fff;
compiler will optimize above two accesses into single load and will
reuse the register in 2nd line.
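A minimal sketch of the raw vlan_tci idea above, written as ordinary user-space C rather than an actual BPF program. It assumes VLAN_TAG_PRESENT is 0x1000 and is set whenever a tag is present, so a raw value of zero means "no vlan header". The helper names and VLAN_VID_MASK are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define VLAN_TAG_PRESENT 0x1000	/* assumed: set by drivers when a tag exists */
#define VLAN_VID_MASK    0x0fff	/* low 12 bits carry the VLAN ID */

/* Returns the VLAN ID, or -1 when the raw tci says no tag is present. */
int vlan_vid_from_raw_tci(uint16_t raw_tci)
{
	if (raw_tci == 0)	/* present bit clear implies the whole field is 0 */
		return -1;
	return raw_tci & VLAN_VID_MASK;
}

/* Returns the 3-bit priority stored in the top bits of the tci. */
unsigned int vlan_prio_from_raw_tci(uint16_t raw_tci)
{
	return (raw_tci >> 13) & 0x7;
}

/* Worked example: one load of the raw field serves both checks. */
void example_raw_tci(void)
{
	uint16_t raw = VLAN_TAG_PRESENT | 42;

	assert(vlan_vid_from_raw_tci(raw) == 42);
	assert(vlan_prio_from_raw_tci(raw) == 0);
	assert(vlan_vid_from_raw_tci(0) == -1);
}
```

Note that a tagged packet with VLAN ID 0 still yields a non-zero raw value in this model, because the present bit is set.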

Daniel Borkmann March 17, 2015, 6:14 p.m. UTC | #3
On 03/17/2015 06:56 PM, Alexei Starovoitov wrote:
> On 3/17/15 2:22 AM, Daniel Borkmann wrote:
>> On 03/17/2015 02:06 AM, Alexei Starovoitov wrote:
...
>>> I was thinking to drop ntohs() from 'protocol' field for extended BPF,
>>> since
>>> the programs could do:
>>> if (skb->protocol == htons(ETH_P_IP))
>>> which would have saved one or two cpu cycles.
>>> But having similar behavior between classic and extended seems to be
>>> better.
>>
>> I'm thinking that skb->protocol == htons(ETH_P_IP) might actually
>> be more obvious, and, as you mentioned, the compiler can already
>> resolve the htons() during compile time instead of runtime, which
>> would be another plus.
>>
>> Either behavior we should document later anyway.
>>
>> The question to me here is, do we need to keep similar behavior?
>>
>> After all, the way of programming both from a user perspective is
>> quite different (i.e. bpf_asm versus C/LLVM).
>
> yeah. we don't have to. Somehow I felt that keeping ntohs will make
> it easier for folks moving from classic to extended, but I guess
> they're different enough, so no point wasting run time cycles.

Yes, I think that case seems reasonable in my opinion.

>> Similarly, I was wondering, if just exporting raw skb->vlan_tci is
>> already sufficient, and the user can e.g. write helpers to extract
>> bits himself from that protocol field?
>
> yes. I thought about the same. Currently VLAN_TAG_PRESENT bit is not
> officially exposed to user space, but implicitly, since that bit
> is always cleared when we return tci to user space and it's always
> set when drivers indicate that vlan header was present in the packet.

Right.

> So I think we can return skb->vlan_tci as-is, since it will save
> one load in bpf program which will be able to do
> if (skb->vlan_tci != 0) /* vlan header is present */
>       vid = skb->vlan_tci & 0x0fff;
> compiler will optimize above two accesses into single load and will
> reuse the register in 2nd line.

Ok, I'm not sure what's best in the vlan_tci case. I think both
options are a possible way to move forward. If the compiler can
further optimize the latter, it might be the better option. I'll
leave that to you. ;)

Thanks,
Daniel
--
David Miller March 17, 2015, 7:06 p.m. UTC | #4
From: Alexei Starovoitov <ast@plumgrid.com>
Date: Mon, 16 Mar 2015 18:06:02 -0700

> as a follow on to patch 70006af95515 ("bpf: allow eBPF access skb fields")
> this patch allows 'protocol' and 'vlan_tci' fields to be accessible
> from extended BPF programs.
> 
> The usage of 'protocol', 'vlan_present' and 'vlan_tci' fields is the same as
> corresponding SKF_AD_PROTOCOL, SKF_AD_VLAN_TAG_PRESENT and SKF_AD_VLAN_TAG
> accesses in classic BPF.
> 
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Applied, thanks.

Patch

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 929545a27546..1623047af463 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -178,6 +178,9 @@  struct __sk_buff {
 	__u32 pkt_type;
 	__u32 mark;
 	__u32 queue_mapping;
+	__u32 protocol;
+	__u32 vlan_present;
+	__u32 vlan_tci;
 };
 
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/net/core/filter.c b/net/core/filter.c
index 4e9dd0ad0d5b..b95ae7fe7e4f 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -177,6 +177,35 @@  static u32 convert_skb_access(int skb_field, int dst_reg, int src_reg,
 		*insn++ = BPF_LDX_MEM(BPF_H, dst_reg, src_reg,
 				      offsetof(struct sk_buff, queue_mapping));
 		break;
+
+	case SKF_AD_PROTOCOL:
+		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, protocol) != 2);
+
+		/* dst_reg = *(u16 *) (src_reg + offsetof(protocol)) */
+		*insn++ = BPF_LDX_MEM(BPF_H, dst_reg, src_reg,
+				      offsetof(struct sk_buff, protocol));
+		/* dst_reg = ntohs(dst_reg) [emitting a nop or swap16] */
+		*insn++ = BPF_ENDIAN(BPF_FROM_BE, dst_reg, 16);
+		break;
+
+	case SKF_AD_VLAN_TAG:
+	case SKF_AD_VLAN_TAG_PRESENT:
+		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
+		BUILD_BUG_ON(VLAN_TAG_PRESENT != 0x1000);
+
+		/* dst_reg = *(u16 *) (src_reg + offsetof(vlan_tci)) */
+		*insn++ = BPF_LDX_MEM(BPF_H, dst_reg, src_reg,
+				      offsetof(struct sk_buff, vlan_tci));
+		if (skb_field == SKF_AD_VLAN_TAG) {
+			*insn++ = BPF_ALU32_IMM(BPF_AND, dst_reg,
+						~VLAN_TAG_PRESENT);
+		} else {
+			/* dst_reg >>= 12 */
+			*insn++ = BPF_ALU32_IMM(BPF_RSH, dst_reg, 12);
+			/* dst_reg &= 1 */
+			*insn++ = BPF_ALU32_IMM(BPF_AND, dst_reg, 1);
+		}
+		break;
 	}
 
 	return insn - insn_buf;
@@ -190,13 +219,8 @@  static bool convert_bpf_extensions(struct sock_filter *fp,
 
 	switch (fp->k) {
 	case SKF_AD_OFF + SKF_AD_PROTOCOL:
-		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, protocol) != 2);
-
-		/* A = *(u16 *) (CTX + offsetof(protocol)) */
-		*insn++ = BPF_LDX_MEM(BPF_H, BPF_REG_A, BPF_REG_CTX,
-				      offsetof(struct sk_buff, protocol));
-		/* A = ntohs(A) [emitting a nop or swap16] */
-		*insn = BPF_ENDIAN(BPF_FROM_BE, BPF_REG_A, 16);
+		cnt = convert_skb_access(SKF_AD_PROTOCOL, BPF_REG_A, BPF_REG_CTX, insn);
+		insn += cnt - 1;
 		break;
 
 	case SKF_AD_OFF + SKF_AD_PKTTYPE:
@@ -242,22 +266,15 @@  static bool convert_bpf_extensions(struct sock_filter *fp,
 		break;
 
 	case SKF_AD_OFF + SKF_AD_VLAN_TAG:
-	case SKF_AD_OFF + SKF_AD_VLAN_TAG_PRESENT:
-		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
-		BUILD_BUG_ON(VLAN_TAG_PRESENT != 0x1000);
+		cnt = convert_skb_access(SKF_AD_VLAN_TAG,
+					 BPF_REG_A, BPF_REG_CTX, insn);
+		insn += cnt - 1;
+		break;
 
-		/* A = *(u16 *) (CTX + offsetof(vlan_tci)) */
-		*insn++ = BPF_LDX_MEM(BPF_H, BPF_REG_A, BPF_REG_CTX,
-				      offsetof(struct sk_buff, vlan_tci));
-		if (fp->k == SKF_AD_OFF + SKF_AD_VLAN_TAG) {
-			*insn = BPF_ALU32_IMM(BPF_AND, BPF_REG_A,
-					      ~VLAN_TAG_PRESENT);
-		} else {
-			/* A >>= 12 */
-			*insn++ = BPF_ALU32_IMM(BPF_RSH, BPF_REG_A, 12);
-			/* A &= 1 */
-			*insn = BPF_ALU32_IMM(BPF_AND, BPF_REG_A, 1);
-		}
+	case SKF_AD_OFF + SKF_AD_VLAN_TAG_PRESENT:
+		cnt = convert_skb_access(SKF_AD_VLAN_TAG_PRESENT,
+					 BPF_REG_A, BPF_REG_CTX, insn);
+		insn += cnt - 1;
 		break;
 
 	case SKF_AD_OFF + SKF_AD_PAY_OFFSET:
@@ -1215,6 +1232,17 @@  static u32 sk_filter_convert_ctx_access(int dst_reg, int src_reg, int ctx_off,
 
 	case offsetof(struct __sk_buff, queue_mapping):
 		return convert_skb_access(SKF_AD_QUEUE, dst_reg, src_reg, insn);
+
+	case offsetof(struct __sk_buff, protocol):
+		return convert_skb_access(SKF_AD_PROTOCOL, dst_reg, src_reg, insn);
+
+	case offsetof(struct __sk_buff, vlan_present):
+		return convert_skb_access(SKF_AD_VLAN_TAG_PRESENT,
+					  dst_reg, src_reg, insn);
+
+	case offsetof(struct __sk_buff, vlan_tci):
+		return convert_skb_access(SKF_AD_VLAN_TAG,
+					  dst_reg, src_reg, insn);
 	}
 
 	return insn - insn_buf;
diff --git a/samples/bpf/test_verifier.c b/samples/bpf/test_verifier.c
index df6dbb6576f6..75d561f9fd6a 100644
--- a/samples/bpf/test_verifier.c
+++ b/samples/bpf/test_verifier.c
@@ -658,6 +658,15 @@  static struct bpf_test tests[] = {
 			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
 				    offsetof(struct __sk_buff, queue_mapping)),
 			BPF_JMP_IMM(BPF_JGE, BPF_REG_0, 0, 0),
+			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
+				    offsetof(struct __sk_buff, protocol)),
+			BPF_JMP_IMM(BPF_JGE, BPF_REG_0, 0, 0),
+			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
+				    offsetof(struct __sk_buff, vlan_present)),
+			BPF_JMP_IMM(BPF_JGE, BPF_REG_0, 0, 0),
+			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
+				    offsetof(struct __sk_buff, vlan_tci)),
+			BPF_JMP_IMM(BPF_JGE, BPF_REG_0, 0, 0),
 			BPF_EXIT_INSN(),
 		},
 		.result = ACCEPT,