Message ID | 1425501718-12066-1-git-send-email-msekleta@redhat.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On 3/4/15 12:41 PM, Michal Sekletar wrote: > This commit introduces new BPF extension. It makes possible to load value of > skb->vlan_proto (vlan tpid) to register A. > > Currently, vlan header is removed from frame and information is available to > userspace only via tpacket interface. Hence, it is not possible to install > filter which uses value of vlan tpid field. > > AFAICT only way how to filter based on tpid value is to reconstruct original > frame encapsulation and interpret BPF filter code in userspace. Doing that is > way slower than doing filtering in kernel. > > Cc: Alexei Starovoitov <ast@plumgrid.com> > Cc: Jiri Pirko <jpirko@redhat.com> > Signed-off-by: Michal Sekletar <msekleta@redhat.com> > --- > @@ -282,6 +282,7 @@ Possible BPF extensions are shown in the following table: > vlan_tci skb_vlan_tag_get(skb) > vlan_pr skb_vlan_tag_present(skb) > rand prandom_u32() > + vlan_proto skb->vlan_proto the patch is correct and looks clean, but I don't understand the motivation for the patch. There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. If there another vlan header inside the packet, it's AD. So you can do the filtering already without adding new bpf extension... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mar 4, 2015, at 1:03 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: > the patch is correct and looks clean, but I don't understand > the motivation for the patch. > There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only > two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. > If there another vlan header inside the packet, it's AD. > So you can do the filtering already without adding new bpf extension... I presume he's referring to https://github.com/the-tcpdump-group/libpcap/issues/397 or https://github.com/the-tcpdump-group/libpcap/issues/390 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/4/15 1:14 PM, Guy Harris wrote: > > On Mar 4, 2015, at 1:03 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: > >> the patch is correct and looks clean, but I don't understand >> the motivation for the patch. >> There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only >> two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. >> If there another vlan header inside the packet, it's AD. >> So you can do the filtering already without adding new bpf extension... > > I presume he's referring to > > https://github.com/the-tcpdump-group/libpcap/issues/397 > > or > > https://github.com/the-tcpdump-group/libpcap/issues/390 ok. context is clear. yet, it still sounds like something to fix inside libpcap. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2015/03/05 6:03, Alexei Starovoitov wrote: > On 3/4/15 12:41 PM, Michal Sekletar wrote: >> This commit introduces new BPF extension. It makes possible to load >> value of >> skb->vlan_proto (vlan tpid) to register A. >> >> Currently, vlan header is removed from frame and information is >> available to >> userspace only via tpacket interface. Hence, it is not possible to >> install >> filter which uses value of vlan tpid field. >> >> AFAICT only way how to filter based on tpid value is to reconstruct >> original >> frame encapsulation and interpret BPF filter code in userspace. Doing >> that is >> way slower than doing filtering in kernel. >> >> Cc: Alexei Starovoitov <ast@plumgrid.com> >> Cc: Jiri Pirko <jpirko@redhat.com> >> Signed-off-by: Michal Sekletar <msekleta@redhat.com> >> --- >> @@ -282,6 +282,7 @@ Possible BPF extensions are shown in the following >> table: >> vlan_tci skb_vlan_tag_get(skb) >> vlan_pr skb_vlan_tag_present(skb) >> rand prandom_u32() >> + vlan_proto skb->vlan_proto > > the patch is correct and looks clean, but I don't understand > the motivation for the patch. > There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only > two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. > If there another vlan header inside the packet, it's AD. 802.1ad can be used without an inner vlan tag. Also, we can send/receive QinQ frames with 802.1Q outer tags. (We can create vlan interface on 802.1Q vlan interface.) Thanks, Toshiaki Makita -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/4/15 6:28 PM, Toshiaki Makita wrote: > On 2015/03/05 6:03, Alexei Starovoitov wrote: >> On 3/4/15 12:41 PM, Michal Sekletar wrote: >>> This commit introduces new BPF extension. It makes possible to load >>> value of >>> skb->vlan_proto (vlan tpid) to register A. >>> >>> Currently, vlan header is removed from frame and information is >>> available to >>> userspace only via tpacket interface. Hence, it is not possible to >>> install >>> filter which uses value of vlan tpid field. >>> >>> AFAICT only way how to filter based on tpid value is to reconstruct >>> original >>> frame encapsulation and interpret BPF filter code in userspace. Doing >>> that is >>> way slower than doing filtering in kernel. >>> >>> Cc: Alexei Starovoitov <ast@plumgrid.com> >>> Cc: Jiri Pirko <jpirko@redhat.com> >>> Signed-off-by: Michal Sekletar <msekleta@redhat.com> >>> --- >>> @@ -282,6 +282,7 @@ Possible BPF extensions are shown in the following >>> table: >>> vlan_tci skb_vlan_tag_get(skb) >>> vlan_pr skb_vlan_tag_present(skb) >>> rand prandom_u32() >>> + vlan_proto skb->vlan_proto >> >> the patch is correct and looks clean, but I don't understand >> the motivation for the patch. >> There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only >> two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. >> If there another vlan header inside the packet, it's AD. > > 802.1ad can be used without an inner vlan tag. > Also, we can send/receive QinQ frames with 802.1Q outer tags. > (We can create vlan interface on 802.1Q vlan interface.) yes, but I think existing 'vlan_tag_present' would be enough to address the issue mentioned in two bugs reports. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Mar 05, 2015 at 12:47:06AM CET, ast@plumgrid.com wrote: >On 3/4/15 1:14 PM, Guy Harris wrote: >> >>On Mar 4, 2015, at 1:03 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >> >>>the patch is correct and looks clean, but I don't understand >>>the motivation for the patch. >>>There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only >>>two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. >>>If there another vlan header inside the packet, it's AD. >>>So you can do the filtering already without adding new bpf extension... >> >>I presume he's referring to >> >> https://github.com/the-tcpdump-group/libpcap/issues/397 >> >>or >> >> https://github.com/the-tcpdump-group/libpcap/issues/390 > >ok. context is clear. >yet, it still sounds like something to fix inside libpcap. Libpcap need to somehow let kernel now what vlan proto it should filter on. Also, it is not true that another vlan header inside packet is AD. You can have packet with just AD header (or 2 AD or 2 Q, etc). -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/4/15 10:50 PM, Jiri Pirko wrote: > Thu, Mar 05, 2015 at 12:47:06AM CET, ast@plumgrid.com wrote: >> On 3/4/15 1:14 PM, Guy Harris wrote: >>> >>> On Mar 4, 2015, at 1:03 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>> >>>> the patch is correct and looks clean, but I don't understand >>>> the motivation for the patch. >>>> There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only >>>> two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. >>>> If there another vlan header inside the packet, it's AD. >>>> So you can do the filtering already without adding new bpf extension... >>> >>> I presume he's referring to >>> >>> https://github.com/the-tcpdump-group/libpcap/issues/397 >>> >>> or >>> >>> https://github.com/the-tcpdump-group/libpcap/issues/390 >> >> ok. context is clear. >> yet, it still sounds like something to fix inside libpcap. > > Libpcap need to somehow let kernel now what vlan proto it should filter on. I don't think that's what bug report talking about. It's looking to match on vlan id regardless of vlan_proto. Only vlan_tag_present is needed to make it work. > Also, it is not true that another vlan header inside packet is AD. You > can have packet with just AD header (or 2 AD or 2 Q, etc). yes. Correct, but then again it doesn't seem to be the goal of the bug report. To make myself clear. I think the patch is ok, I'm only asking for clear justification based on real use case and not invented one. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 05, 2015 at 07:50:53AM +0100, Jiri Pirko wrote: > Thu, Mar 05, 2015 at 12:47:06AM CET, ast@plumgrid.com wrote: > >On 3/4/15 1:14 PM, Guy Harris wrote: > >> > >>On Mar 4, 2015, at 1:03 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: > >> > >>>the patch is correct and looks clean, but I don't understand > >>>the motivation for the patch. > >>>There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only > >>>two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. > >>>If there another vlan header inside the packet, it's AD. > >>>So you can do the filtering already without adding new bpf extension... > >> > >>I presume he's referring to > >> > >> https://github.com/the-tcpdump-group/libpcap/issues/397 > >> > >>or > >> > >> https://github.com/the-tcpdump-group/libpcap/issues/390 > > > >ok. context is clear. > >yet, it still sounds like something to fix inside libpcap. > > Libpcap need to somehow let kernel now what vlan proto it should filter on. To be more precise, it does not need it now as there is no syntax for pcap filter on TPID, one can only filter by VID. But if someone wanted to implement such feature, it could not work with in-kernel filtering at the moment. Michal Kubecek -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/4/15 11:24 PM, Michal Kubecek wrote: > On Thu, Mar 05, 2015 at 07:50:53AM +0100, Jiri Pirko wrote: >> Thu, Mar 05, 2015 at 12:47:06AM CET, ast@plumgrid.com wrote: >>> On 3/4/15 1:14 PM, Guy Harris wrote: >>>> >>>> On Mar 4, 2015, at 1:03 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>> >>>>> the patch is correct and looks clean, but I don't understand >>>>> the motivation for the patch. >>>>> There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only >>>>> two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. >>>>> If there another vlan header inside the packet, it's AD. >>>>> So you can do the filtering already without adding new bpf extension... >>>> >>>> I presume he's referring to >>>> >>>> https://github.com/the-tcpdump-group/libpcap/issues/397 >>>> >>>> or >>>> >>>> https://github.com/the-tcpdump-group/libpcap/issues/390 >>> >>> ok. context is clear. >>> yet, it still sounds like something to fix inside libpcap. >> >> Libpcap need to somehow let kernel now what vlan proto it should filter on. > > To be more precise, it does not need it now as there is no syntax for > pcap filter on TPID, one can only filter by VID. But if someone wanted > to implement such feature, it could not work with in-kernel filtering at > the moment. exactly my point. That's how we should evaluate the patch. It does provide extra visibility for classic bpf programs and it's not strongly required at the moment, but can be useful, no doubt. At the same time it forever exposes skb->vlan_proto to user space, so any refactoring of sk_buff would need to deal with that. With the use case presented I don't have strong opinion one way or the other. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mar 4, 2015, at 11:24 PM, Michal Kubecek <mkubecek@suse.cz> wrote: > To be more precise, it does not need it now as there is no syntax for > pcap filter on TPID, In the current version of libpcap (and the commit dates back to November 2011), the filter "vlan" checks both for 0x8100 and 0x9100 (it should perhaps also check for other TPID values), in userland and in the kernel on platforms other than Linux, as the generated code checks for both values. If there is a way that an in-kernel BPF program can check for all VLAN TPID values, that would be helpful; that would be sufficient to implement the filter "vlan" in a fashion that works the same in the Linux kernel and in other contexts. (I presume that one can, from the data delivered to a PF_PACKET SOCK_RAW socket, reconstruct the packet as received, with all the VLAN tags in place in the packet just as they were when the packet hit the adapter's transceiver from the network. If not, that's a *separate* deficiency.) I'm not currently worried about Backbone Service Instance Tag Control Information right now, as libpcap currently doesn't handle that, so there's no need to, for the generated code for "vlan", actually test what particular TPID value is in the packet, as long as we can test whether there's a VLAN header or not. In the future, that might be useful. (BTW, why did the IEEE remove 802.1Q from the IEEE Get program for IEEE 802 standards? http://standards.ieee.org/about/get/802/802.1.html Are they waiting for 802.1Q-2014 to be published? http://www.ieee802.org/1/pages/802.1Q-2014.html And what's the current status of 802.1ad? Is QinQ deprecated or something?)-- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Wed, Mar 04, 2015 at 09:41:58PM CET, msekleta@redhat.com wrote: >This commit introduces new BPF extension. It makes possible to load value of >skb->vlan_proto (vlan tpid) to register A. > >Currently, vlan header is removed from frame and information is available to >userspace only via tpacket interface. Hence, it is not possible to install >filter which uses value of vlan tpid field. > >AFAICT only way how to filter based on tpid value is to reconstruct original >frame encapsulation and interpret BPF filter code in userspace. Doing that is >way slower than doing filtering in kernel. > >Cc: Alexei Starovoitov <ast@plumgrid.com> >Cc: Jiri Pirko <jpirko@redhat.com> >Signed-off-by: Michal Sekletar <msekleta@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/4/15, Michal Sekletar <msekleta@redhat.com> wrote: > This commit introduces new BPF extension. It makes possible to load value > of > skb->vlan_proto (vlan tpid) to register A. > > Currently, vlan header is removed from frame and information is available > to > userspace only via tpacket interface. Hence, it is not possible to install > filter which uses value of vlan tpid field. > > AFAICT only way how to filter based on tpid value is to reconstruct > original > frame encapsulation and interpret BPF filter code in userspace. Doing that > is > way slower than doing filtering in kernel. > > Cc: Alexei Starovoitov <ast@plumgrid.com> > Cc: Jiri Pirko <jpirko@redhat.com> > Signed-off-by: Michal Sekletar <msekleta@redhat.com> > --- > Documentation/networking/filter.txt | 1 + > arch/arm/net/bpf_jit_32.c | 6 ++++++ > arch/mips/net/bpf_jit.c | 6 ++++++ > arch/powerpc/net/bpf_jit_comp.c | 5 +++++ > arch/s390/net/bpf_jit_comp.c | 5 +++++ > arch/sparc/net/bpf_jit_comp.c | 3 +++ > include/linux/filter.h | 1 + > include/uapi/linux/filter.h | 3 ++- > net/core/filter.c | 10 ++++++++++ > tools/net/bpf_exp.l | 1 + > tools/net/bpf_exp.y | 11 ++++++++++- > 11 files changed, 50 insertions(+), 2 deletions(-) > > diff --git a/Documentation/networking/filter.txt > b/Documentation/networking/filter.txt > index 9930ecfb..3155c4c 100644 > --- a/Documentation/networking/filter.txt > +++ b/Documentation/networking/filter.txt > @@ -282,6 +282,7 @@ Possible BPF extensions are shown in the following > table: > vlan_tci skb_vlan_tag_get(skb) > vlan_pr skb_vlan_tag_present(skb) > rand prandom_u32() > + vlan_proto skb->vlan_proto > > These extensions can also be prefixed with '#'. > Examples for low-level BPF: > diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c > index e1268f9..2954e05 100644 > --- a/arch/arm/net/bpf_jit_32.c > +++ b/arch/arm/net/bpf_jit_32.c > @@ -843,6 +843,12 @@ b_epilogue: > else > OP_IMM3(ARM_AND, r_A, r_A, VLAN_TAG_PRESENT, ctx); > break; > + case BPF_ANC | SKF_AD_VLAN_PROTO: > + ctx->seen |= SEEN_SKB; > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > + off = offsetof(struct sk_buff, vlan_proto); > + emit(ARM_LDR_I(r_A, r_skb, off), ctx); > + break; > case BPF_ANC | SKF_AD_QUEUE: > ctx->seen |= SEEN_SKB; > BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, > diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c > index 5d61393..732fb1d 100644 > --- a/arch/mips/net/bpf_jit.c > +++ b/arch/mips/net/bpf_jit.c > @@ -1295,6 +1295,12 @@ jmp_cmp: > emit_sltu(r_A, r_zero, r_A, ctx); > } > break; > + case BPF_ANC | SKF_AD_VLAN_PROTO: > + ctx->flags |= SEEN_SKB | SEEN_A; > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > + off = offsetof(struct sk_buff, vlan_proto); > + emit_load(r_A, r_skb, off, ctx); > + break; > case BPF_ANC | SKF_AD_PKTTYPE: > ctx->flags |= SEEN_SKB; > > diff --git a/arch/powerpc/net/bpf_jit_comp.c > b/arch/powerpc/net/bpf_jit_comp.c > index 17cea18..acfa732 100644 > --- a/arch/powerpc/net/bpf_jit_comp.c > +++ b/arch/powerpc/net/bpf_jit_comp.c > @@ -399,6 +399,11 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 > *image, > PPC_SRWI(r_A, r_A, 12); > } > break; > + case BPF_ANC | SKF_AD_VLAN_PROTO: > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > + PPC_LWZ_OFFS(r_A, r_skb, offsetof(struct sk_buff, > + vlan_proto)); You're going to load a halfword, so lhz has to be used > + break; > case BPF_ANC | SKF_AD_QUEUE: > BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, > queue_mapping) != 2); > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c > index bbd1981..55b6db7 100644 > --- a/arch/s390/net/bpf_jit_comp.c > +++ b/arch/s390/net/bpf_jit_comp.c > @@ -722,6 +722,11 @@ call_fn: /* lg %r1,<d(function)>(%r13) */ > EMIT4_DISP(0x88500000, 12); > } > break; > + case BPF_ANC | SKF_AD_VLAN_PROTO: /* A = skb->vlan_proto */ > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > + /* l %r5,<d(vlan_proto)>(%r2) */ > + EMIT4_DISP(0x58502000, offsetof(struct sk_buff, vlan_proto)); > + break; > case BPF_ANC | SKF_AD_PKTTYPE: > /* lhi %r5,0 */ > EMIT4(0xa7580000); > diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c > index 7931eee..cf3e6ac 100644 > --- a/arch/sparc/net/bpf_jit_comp.c > +++ b/arch/sparc/net/bpf_jit_comp.c > @@ -624,6 +624,9 @@ void bpf_jit_compile(struct bpf_prog *fp) > emit_and(r_A, r_TMP, r_A); > } > break; > + case BPF_ANC | SKF_AD_VLAN_PROTO: > + emit_skb_load32(vlan_proto, r_A); > + break; > case BPF_LD | BPF_W | BPF_LEN: > emit_skb_load32(len, r_A); > break; > diff --git a/include/linux/filter.h b/include/linux/filter.h > index 9ee8c67..3ec42a1 100644 > --- a/include/linux/filter.h > +++ b/include/linux/filter.h > @@ -454,6 +454,7 @@ static inline u16 bpf_anc_helper(const struct > sock_filter *ftest) > BPF_ANCILLARY(VLAN_TAG_PRESENT); > BPF_ANCILLARY(PAY_OFFSET); > BPF_ANCILLARY(RANDOM); > + BPF_ANCILLARY(VLAN_PROTO); > } > /* Fallthrough. */ > default: > diff --git a/include/uapi/linux/filter.h b/include/uapi/linux/filter.h > index 47785d5..aef103bf 100644 > --- a/include/uapi/linux/filter.h > +++ b/include/uapi/linux/filter.h > @@ -77,7 +77,8 @@ struct sock_fprog { /* Required for SO_ATTACH_FILTER. */ > #define SKF_AD_VLAN_TAG_PRESENT 48 > #define SKF_AD_PAY_OFFSET 52 > #define SKF_AD_RANDOM 56 > -#define SKF_AD_MAX 60 > +#define SKF_AD_VLAN_PROTO 60 > +#define SKF_AD_MAX 64 > #define SKF_NET_OFF (-0x100000) > #define SKF_LL_OFF (-0x200000) > > diff --git a/net/core/filter.c b/net/core/filter.c > index 7a4eb70..b14cc40 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -236,6 +236,16 @@ static bool convert_bpf_extensions(struct sock_filter > *fp, > } > break; > > + case SKF_AD_OFF + SKF_AD_VLAN_PROTO: > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > + > + /* A = *(u16 *) (CTX + offsetof(vlan_proto)) */ > + *insn++ = BPF_LDX_MEM(BPF_H, BPF_REG_A, BPF_REG_CTX, > + offsetof(struct sk_buff, vlan_proto)); > + /* A = ntohs(A) [emitting a nop or swap16] */ > + *insn = BPF_ENDIAN(BPF_FROM_BE, BPF_REG_A, 16); > + break; > + > case SKF_AD_OFF + SKF_AD_PAY_OFFSET: > case SKF_AD_OFF + SKF_AD_NLATTR: > case SKF_AD_OFF + SKF_AD_NLATTR_NEST: > diff --git a/tools/net/bpf_exp.l b/tools/net/bpf_exp.l > index 833a966..0d4e72f 100644 > --- a/tools/net/bpf_exp.l > +++ b/tools/net/bpf_exp.l > @@ -93,6 +93,7 @@ extern void yyerror(const char *str); > "#"?("vlan_tci") { return K_VLANT; } > "#"?("vlan_pr") { return K_VLANP; } > "#"?("rand") { return K_RAND; } > +"#"?("vlan_proto") { return K_VLANPR; } > > ":" { return ':'; } > "," { return ','; } > diff --git a/tools/net/bpf_exp.y b/tools/net/bpf_exp.y > index e6306c5..9c7a0e7 100644 > --- a/tools/net/bpf_exp.y > +++ b/tools/net/bpf_exp.y > @@ -56,7 +56,7 @@ static void bpf_set_jmp_label(char *label, enum jmp_type > type); > %token OP_LDXI > > %token K_PKT_LEN K_PROTO K_TYPE K_NLATTR K_NLATTR_NEST K_MARK K_QUEUE > K_HATYPE > -%token K_RXHASH K_CPU K_IFIDX K_VLANT K_VLANP K_POFF K_RAND > +%token K_RXHASH K_CPU K_IFIDX K_VLANT K_VLANP K_POFF K_RAND K_VLANPR > > %token ':' ',' '[' ']' '(' ')' 'x' 'a' '+' 'M' '*' '&' '#' '%' > > @@ -167,6 +167,9 @@ ldb > | OP_LDB K_RAND { > bpf_set_curr_instr(BPF_LD | BPF_B | BPF_ABS, 0, 0, > SKF_AD_OFF + SKF_AD_RANDOM); } > + | OP_LDB K_VLANPR { > + bpf_set_curr_instr(BPF_LD | BPF_B | BPF_ABS, 0, 0, > + SKF_AD_OFF + SKF_AD_VLAN_PROTO); } > ; > > ldh > @@ -218,6 +221,9 @@ ldh > | OP_LDH K_RAND { > bpf_set_curr_instr(BPF_LD | BPF_H | BPF_ABS, 0, 0, > SKF_AD_OFF + SKF_AD_RANDOM); } > + | OP_LDH K_VLANPR { > + bpf_set_curr_instr(BPF_LD | BPF_H | BPF_ABS, 0, 0, > + SKF_AD_OFF + SKF_AD_VLAN_PROTO); } > ; > > ldi > @@ -274,6 +280,9 @@ ld > | OP_LD K_RAND { > bpf_set_curr_instr(BPF_LD | BPF_W | BPF_ABS, 0, 0, > SKF_AD_OFF + SKF_AD_RANDOM); } > + | OP_LD K_VLANPR { > + bpf_set_curr_instr(BPF_LD | BPF_W | BPF_ABS, 0, 0, > + SKF_AD_OFF + SKF_AD_VLAN_PROTO); } > | OP_LD 'M' '[' number ']' { > bpf_set_curr_instr(BPF_LD | BPF_MEM, 0, 0, $4); } > | OP_LD '[' 'x' '+' number ']' { > -- > 1.8.3.1 > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 05, 2015 at 12:35:01AM -0800, Guy Harris wrote: > > On Mar 4, 2015, at 11:24 PM, Michal Kubecek <mkubecek@suse.cz> wrote: > > > To be more precise, it does not need it now as there is no syntax for > > pcap filter on TPID, > > In the current version of libpcap (and the commit dates back to November > 2011), the filter "vlan" checks both for 0x8100 and 0x9100 Actually, there are four different pieces of code: - userspace checking packet contents - userspace checking packet metadata - kernel BPF checking packet contents - kernel BPF checking packet metadata As far as I can see, only the BPF generated to check packet contents does check TPID value, BPF checking metadata does not (neither in kernel nor in userspace). > (it should perhaps also check for other TPID values), in userland and > in the kernel on platforms other than Linux, as the generated code > checks for both values. ... > (I presume that one can, from the data delivered to a PF_PACKET > SOCK_RAW socket, reconstruct the packet as received, with all the VLAN > tags in place in the packet just as they were when the packet hit the > adapter's transceiver from the network. If not, that's a *separate* > deficiency.) These two issues were addressed by https://github.com/the-tcpdump-group/libpcap/pull/351 but the code changed since then so I'll have to check the changes and create a new pull request. Michal Kubecek -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 04, 2015 at 01:03:50PM -0800, Alexei Starovoitov wrote: > On 3/4/15 12:41 PM, Michal Sekletar wrote: > >This commit introduces new BPF extension. It makes possible to load value of > >skb->vlan_proto (vlan tpid) to register A. > > > >Currently, vlan header is removed from frame and information is available to > >userspace only via tpacket interface. Hence, it is not possible to install > >filter which uses value of vlan tpid field. > > > >AFAICT only way how to filter based on tpid value is to reconstruct original > >frame encapsulation and interpret BPF filter code in userspace. Doing that is > >way slower than doing filtering in kernel. > > > >Cc: Alexei Starovoitov <ast@plumgrid.com> > >Cc: Jiri Pirko <jpirko@redhat.com> > >Signed-off-by: Michal Sekletar <msekleta@redhat.com> > >--- > >@@ -282,6 +282,7 @@ Possible BPF extensions are shown in the following table: > > vlan_tci skb_vlan_tag_get(skb) > > vlan_pr skb_vlan_tag_present(skb) > > rand prandom_u32() > >+ vlan_proto skb->vlan_proto > > the patch is correct and looks clean, but I don't understand > the motivation for the patch. Way how libpcap currently uses BPF extensions is not compatible with old behavior where actual value of tpid field was checked. I wanted to address that, i.e. if "vlan" keyword is used as filter expression, libpcap should install a filter such that only ethernet frames having tpid value of 0x8100 or 0x9100 will pass. That is not the case with current libpcap git and 4.0-rc1 kernel. Given that I broke libpcap as described above I tried to come up with the way how to fix that. However I realized that with recent kernels there is no other way than adding new BPF extension. > There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only > two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. Any reason why ETH_P_QINQ1, ETH_P_QINQ2, ETH_P_QINQ3 no longer works? If I understand correctly, you are basically saying, that there is no point checking for vlan tpid because PF_PACKET socket will never receive frame having other tpid value than above two anyway. So bottom line is that I wanted to grant userspace programs more flexibility, and you are saying that it is pointless because for example if outer tpid is 0x9100 socket will never receive the frame. If that is the case then disregard the patch. > If there another vlan header inside the packet, it's AD. > So you can do the filtering already without adding new bpf extension... Cheers, Michal -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 05, 2015 at 11:57:05AM +0300, Denis Kirjanov wrote: > On 3/4/15, Michal Sekletar <msekleta@redhat.com> wrote: > > This commit introduces new BPF extension. It makes possible to load value > > of > > skb->vlan_proto (vlan tpid) to register A. > > > > Currently, vlan header is removed from frame and information is available > > to > > userspace only via tpacket interface. Hence, it is not possible to install > > filter which uses value of vlan tpid field. > > > > AFAICT only way how to filter based on tpid value is to reconstruct > > original > > frame encapsulation and interpret BPF filter code in userspace. Doing that > > is > > way slower than doing filtering in kernel. > > > > Cc: Alexei Starovoitov <ast@plumgrid.com> > > Cc: Jiri Pirko <jpirko@redhat.com> > > Signed-off-by: Michal Sekletar <msekleta@redhat.com> > > --- > > Documentation/networking/filter.txt | 1 + > > arch/arm/net/bpf_jit_32.c | 6 ++++++ > > arch/mips/net/bpf_jit.c | 6 ++++++ > > arch/powerpc/net/bpf_jit_comp.c | 5 +++++ > > arch/s390/net/bpf_jit_comp.c | 5 +++++ > > arch/sparc/net/bpf_jit_comp.c | 3 +++ > > include/linux/filter.h | 1 + > > include/uapi/linux/filter.h | 3 ++- > > net/core/filter.c | 10 ++++++++++ > > tools/net/bpf_exp.l | 1 + > > tools/net/bpf_exp.y | 11 ++++++++++- > > 11 files changed, 50 insertions(+), 2 deletions(-) > > > > diff --git a/Documentation/networking/filter.txt > > b/Documentation/networking/filter.txt > > index 9930ecfb..3155c4c 100644 > > --- a/Documentation/networking/filter.txt > > +++ b/Documentation/networking/filter.txt > > @@ -282,6 +282,7 @@ Possible BPF extensions are shown in the following > > table: > > vlan_tci skb_vlan_tag_get(skb) > > vlan_pr skb_vlan_tag_present(skb) > > rand prandom_u32() > > + vlan_proto skb->vlan_proto > > > > These extensions can also be prefixed with '#'. > > Examples for low-level BPF: > > diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c > > index e1268f9..2954e05 100644 > > --- a/arch/arm/net/bpf_jit_32.c > > +++ b/arch/arm/net/bpf_jit_32.c > > @@ -843,6 +843,12 @@ b_epilogue: > > else > > OP_IMM3(ARM_AND, r_A, r_A, VLAN_TAG_PRESENT, ctx); > > break; > > + case BPF_ANC | SKF_AD_VLAN_PROTO: > > + ctx->seen |= SEEN_SKB; > > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > > + off = offsetof(struct sk_buff, vlan_proto); > > + emit(ARM_LDR_I(r_A, r_skb, off), ctx); > > + break; > > case BPF_ANC | SKF_AD_QUEUE: > > ctx->seen |= SEEN_SKB; > > BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, > > diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c > > index 5d61393..732fb1d 100644 > > --- a/arch/mips/net/bpf_jit.c > > +++ b/arch/mips/net/bpf_jit.c > > @@ -1295,6 +1295,12 @@ jmp_cmp: > > emit_sltu(r_A, r_zero, r_A, ctx); > > } > > break; > > + case BPF_ANC | SKF_AD_VLAN_PROTO: > > + ctx->flags |= SEEN_SKB | SEEN_A; > > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > > + off = offsetof(struct sk_buff, vlan_proto); > > + emit_load(r_A, r_skb, off, ctx); > > + break; > > case BPF_ANC | SKF_AD_PKTTYPE: > > ctx->flags |= SEEN_SKB; > > > > diff --git a/arch/powerpc/net/bpf_jit_comp.c > > b/arch/powerpc/net/bpf_jit_comp.c > > index 17cea18..acfa732 100644 > > --- a/arch/powerpc/net/bpf_jit_comp.c > > +++ b/arch/powerpc/net/bpf_jit_comp.c > > @@ -399,6 +399,11 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 > > *image, > > PPC_SRWI(r_A, r_A, 12); > > } > > break; > > + case BPF_ANC | SKF_AD_VLAN_PROTO: > > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > > + PPC_LWZ_OFFS(r_A, r_skb, offsetof(struct sk_buff, > > + vlan_proto)); > You're going to load a halfword, so lhz has to be used I treated skb->vlan_proto as word instead of halfword on other arches too. I fixed them all now, but I will not send out v2 just yet. First, I'd like to hear back from Alexei. > > > + break; > > case BPF_ANC | SKF_AD_QUEUE: > > BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, > > queue_mapping) != 2); > > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c > > index bbd1981..55b6db7 100644 > > --- a/arch/s390/net/bpf_jit_comp.c > > +++ b/arch/s390/net/bpf_jit_comp.c > > @@ -722,6 +722,11 @@ call_fn: /* lg %r1,<d(function)>(%r13) */ > > EMIT4_DISP(0x88500000, 12); > > } > > break; > > + case BPF_ANC | SKF_AD_VLAN_PROTO: /* A = skb->vlan_proto */ > > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > > + /* l %r5,<d(vlan_proto)>(%r2) */ > > + EMIT4_DISP(0x58502000, offsetof(struct sk_buff, vlan_proto)); > > + break; > > case BPF_ANC | SKF_AD_PKTTYPE: > > /* lhi %r5,0 */ > > EMIT4(0xa7580000); > > diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c > > index 7931eee..cf3e6ac 100644 > > --- a/arch/sparc/net/bpf_jit_comp.c > > +++ b/arch/sparc/net/bpf_jit_comp.c > > @@ -624,6 +624,9 @@ void bpf_jit_compile(struct bpf_prog *fp) > > emit_and(r_A, r_TMP, r_A); > > } > > break; > > + case BPF_ANC | SKF_AD_VLAN_PROTO: > > + emit_skb_load32(vlan_proto, r_A); > > + break; > > case BPF_LD | BPF_W | BPF_LEN: > > emit_skb_load32(len, r_A); > > break; > > diff --git a/include/linux/filter.h b/include/linux/filter.h > > index 9ee8c67..3ec42a1 100644 > > --- a/include/linux/filter.h > > +++ b/include/linux/filter.h > > @@ -454,6 +454,7 @@ static inline u16 bpf_anc_helper(const struct > > sock_filter *ftest) > > BPF_ANCILLARY(VLAN_TAG_PRESENT); > > BPF_ANCILLARY(PAY_OFFSET); > > BPF_ANCILLARY(RANDOM); > > + BPF_ANCILLARY(VLAN_PROTO); > > } > > /* Fallthrough. */ > > default: > > diff --git a/include/uapi/linux/filter.h b/include/uapi/linux/filter.h > > index 47785d5..aef103bf 100644 > > --- a/include/uapi/linux/filter.h > > +++ b/include/uapi/linux/filter.h > > @@ -77,7 +77,8 @@ struct sock_fprog { /* Required for SO_ATTACH_FILTER. */ > > #define SKF_AD_VLAN_TAG_PRESENT 48 > > #define SKF_AD_PAY_OFFSET 52 > > #define SKF_AD_RANDOM 56 > > -#define SKF_AD_MAX 60 > > +#define SKF_AD_VLAN_PROTO 60 > > +#define SKF_AD_MAX 64 > > #define SKF_NET_OFF (-0x100000) > > #define SKF_LL_OFF (-0x200000) > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > index 7a4eb70..b14cc40 100644 > > --- a/net/core/filter.c > > +++ b/net/core/filter.c > > @@ -236,6 +236,16 @@ static bool convert_bpf_extensions(struct sock_filter > > *fp, > > } > > break; > > > > + case SKF_AD_OFF + SKF_AD_VLAN_PROTO: > > + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); > > + > > + /* A = *(u16 *) (CTX + offsetof(vlan_proto)) */ > > + *insn++ = BPF_LDX_MEM(BPF_H, BPF_REG_A, BPF_REG_CTX, > > + offsetof(struct sk_buff, vlan_proto)); > > + /* A = ntohs(A) [emitting a nop or swap16] */ > > + *insn = BPF_ENDIAN(BPF_FROM_BE, BPF_REG_A, 16); > > + break; > > + > > case SKF_AD_OFF + SKF_AD_PAY_OFFSET: > > case SKF_AD_OFF + SKF_AD_NLATTR: > > case SKF_AD_OFF + SKF_AD_NLATTR_NEST: > > diff --git a/tools/net/bpf_exp.l b/tools/net/bpf_exp.l > > index 833a966..0d4e72f 100644 > > --- a/tools/net/bpf_exp.l > > +++ b/tools/net/bpf_exp.l > > @@ -93,6 +93,7 @@ extern void yyerror(const char *str); > > "#"?("vlan_tci") { return K_VLANT; } > > "#"?("vlan_pr") { return K_VLANP; } > > "#"?("rand") { return K_RAND; } > > +"#"?("vlan_proto") { return K_VLANPR; } > > > > ":" { return ':'; } > > "," { return ','; } > > diff --git a/tools/net/bpf_exp.y b/tools/net/bpf_exp.y > > index e6306c5..9c7a0e7 100644 > > --- a/tools/net/bpf_exp.y > > +++ b/tools/net/bpf_exp.y > > @@ -56,7 +56,7 @@ static void bpf_set_jmp_label(char *label, enum jmp_type > > type); > > %token OP_LDXI > > > > %token K_PKT_LEN K_PROTO K_TYPE K_NLATTR K_NLATTR_NEST K_MARK K_QUEUE > > K_HATYPE > > -%token K_RXHASH K_CPU K_IFIDX K_VLANT K_VLANP K_POFF K_RAND > > +%token K_RXHASH K_CPU K_IFIDX K_VLANT K_VLANP K_POFF K_RAND K_VLANPR > > > > %token ':' ',' '[' ']' '(' ')' 'x' 'a' '+' 'M' '*' '&' '#' '%' > > > > @@ -167,6 +167,9 @@ ldb > > | OP_LDB K_RAND { > > bpf_set_curr_instr(BPF_LD | BPF_B | BPF_ABS, 0, 0, > > SKF_AD_OFF + SKF_AD_RANDOM); } > > + | OP_LDB K_VLANPR { > > + bpf_set_curr_instr(BPF_LD | BPF_B | BPF_ABS, 0, 0, > > + SKF_AD_OFF + SKF_AD_VLAN_PROTO); } > > ; > > > > ldh > > @@ -218,6 +221,9 @@ ldh > > | OP_LDH K_RAND { > > bpf_set_curr_instr(BPF_LD | BPF_H | BPF_ABS, 0, 0, > > SKF_AD_OFF + SKF_AD_RANDOM); } > > + | OP_LDH K_VLANPR { > > + bpf_set_curr_instr(BPF_LD | BPF_H | BPF_ABS, 0, 0, > > + SKF_AD_OFF + SKF_AD_VLAN_PROTO); } > > ; > > > > ldi > > @@ -274,6 +280,9 @@ ld > > | OP_LD K_RAND { > > bpf_set_curr_instr(BPF_LD | BPF_W | BPF_ABS, 0, 0, > > SKF_AD_OFF + SKF_AD_RANDOM); } > > + | OP_LD K_VLANPR { > > + bpf_set_curr_instr(BPF_LD | BPF_W | BPF_ABS, 0, 0, > > + SKF_AD_OFF + SKF_AD_VLAN_PROTO); } > > | OP_LD 'M' '[' number ']' { > > bpf_set_curr_instr(BPF_LD | BPF_MEM, 0, 0, $4); } > > | OP_LD '[' 'x' '+' number ']' { > > -- > > 1.8.3.1 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe netdev" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/5/15 2:37 AM, Michal Sekletar wrote: > On Wed, Mar 04, 2015 at 01:03:50PM -0800, Alexei Starovoitov wrote: >> On 3/4/15 12:41 PM, Michal Sekletar wrote: >>> This commit introduces new BPF extension. It makes possible to load value of >>> skb->vlan_proto (vlan tpid) to register A. >>> >>> Currently, vlan header is removed from frame and information is available to >>> userspace only via tpacket interface. Hence, it is not possible to install >>> filter which uses value of vlan tpid field. >>> >>> AFAICT only way how to filter based on tpid value is to reconstruct original >>> frame encapsulation and interpret BPF filter code in userspace. Doing that is >>> way slower than doing filtering in kernel. >>> >>> Cc: Alexei Starovoitov <ast@plumgrid.com> >>> Cc: Jiri Pirko <jpirko@redhat.com> >>> Signed-off-by: Michal Sekletar <msekleta@redhat.com> >>> --- >>> @@ -282,6 +282,7 @@ Possible BPF extensions are shown in the following table: >>> vlan_tci skb_vlan_tag_get(skb) >>> vlan_pr skb_vlan_tag_present(skb) >>> rand prandom_u32() >>> + vlan_proto skb->vlan_proto >> >> the patch is correct and looks clean, but I don't understand >> the motivation for the patch. > > Way how libpcap currently uses BPF extensions is not compatible with old > behavior where actual value of tpid field was checked. I wanted to address > that, i.e. if "vlan" keyword is used as filter expression, libpcap should > install a filter such that only ethernet frames having tpid value of 0x8100 or > 0x9100 will pass. That is not the case with current libpcap git and 4.0-rc1 > kernel. > > Given that I broke libpcap as described above I tried to come up with the way > how to fix that. However I realized that with recent kernels there is no other > way than adding new BPF extension. > >> There is already SKF_AD_VLAN_TAG_PRESENT. If it is set then only >> two possible values of vlan_proto are ETH_P_8021Q or ETH_P_8021AD. > > Any reason why ETH_P_QINQ1, ETH_P_QINQ2, ETH_P_QINQ3 no longer works? If I > understand correctly, you are basically saying, that there is no point checking > for vlan tpid because PF_PACKET socket will never receive frame having other > tpid value than above two anyway. > > So bottom line is that I wanted to grant userspace programs more flexibility, > and you are saying that it is pointless because for example if outer tpid is > 0x9100 socket will never receive the frame. If that is the case then > disregard the patch. steering towards vlan device happens only for ETH_P_8021Q and ETH_P_8021AD. Non-standard 0x9100 and other tags won't be popped into skb metadata and will stay as-is in the packet body. If the meaning of 'vlan 100' in libpcap is to detect all possible vlan tpid then bpf program would need to check VLAN_TAG_PRESENT (that would mean vlan_proto is either 0x8100 or 0x88A8) and parse packet body for tpids 0x9[123]00. Whether we add access to skb->vlan_proto or not, the program would still need to do the above steps, but instead of checking for vlan_tag_present only, it would need to do vlan_proto==0x8100 or vlan_proto=0x88a8 and then parse the packet for tpid=0x9[123]00 so adding access to vlan_proto will not simplify libpcap job. At this point I think it's up to Dave to decide whether we need this patch (after fixing the issue pointed by Denis) or not. imo there is a benefit of giving programs more visibility into skb metadata. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Denis Kirjanov <kda@linux-powerpc.org> Date: Thu, 5 Mar 2015 11:57:05 +0300 > On 3/4/15, Michal Sekletar <msekleta@redhat.com> wrote: >> --- a/arch/powerpc/net/bpf_jit_comp.c >> +++ b/arch/powerpc/net/bpf_jit_comp.c >> @@ -399,6 +399,11 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 >> *image, >> PPC_SRWI(r_A, r_A, 12); >> } >> break; >> + case BPF_ANC | SKF_AD_VLAN_PROTO: >> + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); >> + PPC_LWZ_OFFS(r_A, r_skb, offsetof(struct sk_buff, >> + vlan_proto)); > You're going to load a halfword, so lhz has to be used > >> + break; Same bug in the sparc implementation too: >> @@ -624,6 +624,9 @@ void bpf_jit_compile(struct bpf_prog *fp) >> emit_and(r_A, r_TMP, r_A); >> } >> break; >> + case BPF_ANC | SKF_AD_VLAN_PROTO: >> + emit_skb_load32(vlan_proto, r_A); >> + break; >> case BPF_LD | BPF_W | BPF_LEN: >> emit_skb_load32(len, r_A); >> break; This needs to use emit_load16(). These JIT changes seem to have been done a little carelessly. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt index 9930ecfb..3155c4c 100644 --- a/Documentation/networking/filter.txt +++ b/Documentation/networking/filter.txt @@ -282,6 +282,7 @@ Possible BPF extensions are shown in the following table: vlan_tci skb_vlan_tag_get(skb) vlan_pr skb_vlan_tag_present(skb) rand prandom_u32() + vlan_proto skb->vlan_proto These extensions can also be prefixed with '#'. Examples for low-level BPF: diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c index e1268f9..2954e05 100644 --- a/arch/arm/net/bpf_jit_32.c +++ b/arch/arm/net/bpf_jit_32.c @@ -843,6 +843,12 @@ b_epilogue: else OP_IMM3(ARM_AND, r_A, r_A, VLAN_TAG_PRESENT, ctx); break; + case BPF_ANC | SKF_AD_VLAN_PROTO: + ctx->seen |= SEEN_SKB; + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); + off = offsetof(struct sk_buff, vlan_proto); + emit(ARM_LDR_I(r_A, r_skb, off), ctx); + break; case BPF_ANC | SKF_AD_QUEUE: ctx->seen |= SEEN_SKB; BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c index 5d61393..732fb1d 100644 --- a/arch/mips/net/bpf_jit.c +++ b/arch/mips/net/bpf_jit.c @@ -1295,6 +1295,12 @@ jmp_cmp: emit_sltu(r_A, r_zero, r_A, ctx); } break; + case BPF_ANC | SKF_AD_VLAN_PROTO: + ctx->flags |= SEEN_SKB | SEEN_A; + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); + off = offsetof(struct sk_buff, vlan_proto); + emit_load(r_A, r_skb, off, ctx); + break; case BPF_ANC | SKF_AD_PKTTYPE: ctx->flags |= SEEN_SKB; diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c index 17cea18..acfa732 100644 --- a/arch/powerpc/net/bpf_jit_comp.c +++ b/arch/powerpc/net/bpf_jit_comp.c @@ -399,6 +399,11 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, PPC_SRWI(r_A, r_A, 12); } break; + case BPF_ANC | SKF_AD_VLAN_PROTO: + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); + PPC_LWZ_OFFS(r_A, r_skb, offsetof(struct sk_buff, + vlan_proto)); + break; case BPF_ANC | SKF_AD_QUEUE: BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, queue_mapping) != 2); diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c index bbd1981..55b6db7 100644 --- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -722,6 +722,11 @@ call_fn: /* lg %r1,<d(function)>(%r13) */ EMIT4_DISP(0x88500000, 12); } break; + case BPF_ANC | SKF_AD_VLAN_PROTO: /* A = skb->vlan_proto */ + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); + /* l %r5,<d(vlan_proto)>(%r2) */ + EMIT4_DISP(0x58502000, offsetof(struct sk_buff, vlan_proto)); + break; case BPF_ANC | SKF_AD_PKTTYPE: /* lhi %r5,0 */ EMIT4(0xa7580000); diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c index 7931eee..cf3e6ac 100644 --- a/arch/sparc/net/bpf_jit_comp.c +++ b/arch/sparc/net/bpf_jit_comp.c @@ -624,6 +624,9 @@ void bpf_jit_compile(struct bpf_prog *fp) emit_and(r_A, r_TMP, r_A); } break; + case BPF_ANC | SKF_AD_VLAN_PROTO: + emit_skb_load32(vlan_proto, r_A); + break; case BPF_LD | BPF_W | BPF_LEN: emit_skb_load32(len, r_A); break; diff --git a/include/linux/filter.h b/include/linux/filter.h index 9ee8c67..3ec42a1 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -454,6 +454,7 @@ static inline u16 bpf_anc_helper(const struct sock_filter *ftest) BPF_ANCILLARY(VLAN_TAG_PRESENT); BPF_ANCILLARY(PAY_OFFSET); BPF_ANCILLARY(RANDOM); + BPF_ANCILLARY(VLAN_PROTO); } /* Fallthrough. */ default: diff --git a/include/uapi/linux/filter.h b/include/uapi/linux/filter.h index 47785d5..aef103bf 100644 --- a/include/uapi/linux/filter.h +++ b/include/uapi/linux/filter.h @@ -77,7 +77,8 @@ struct sock_fprog { /* Required for SO_ATTACH_FILTER. */ #define SKF_AD_VLAN_TAG_PRESENT 48 #define SKF_AD_PAY_OFFSET 52 #define SKF_AD_RANDOM 56 -#define SKF_AD_MAX 60 +#define SKF_AD_VLAN_PROTO 60 +#define SKF_AD_MAX 64 #define SKF_NET_OFF (-0x100000) #define SKF_LL_OFF (-0x200000) diff --git a/net/core/filter.c b/net/core/filter.c index 7a4eb70..b14cc40 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -236,6 +236,16 @@ static bool convert_bpf_extensions(struct sock_filter *fp, } break; + case SKF_AD_OFF + SKF_AD_VLAN_PROTO: + BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_proto) != 2); + + /* A = *(u16 *) (CTX + offsetof(vlan_proto)) */ + *insn++ = BPF_LDX_MEM(BPF_H, BPF_REG_A, BPF_REG_CTX, + offsetof(struct sk_buff, vlan_proto)); + /* A = ntohs(A) [emitting a nop or swap16] */ + *insn = BPF_ENDIAN(BPF_FROM_BE, BPF_REG_A, 16); + break; + case SKF_AD_OFF + SKF_AD_PAY_OFFSET: case SKF_AD_OFF + SKF_AD_NLATTR: case SKF_AD_OFF + SKF_AD_NLATTR_NEST: diff --git a/tools/net/bpf_exp.l b/tools/net/bpf_exp.l index 833a966..0d4e72f 100644 --- a/tools/net/bpf_exp.l +++ b/tools/net/bpf_exp.l @@ -93,6 +93,7 @@ extern void yyerror(const char *str); "#"?("vlan_tci") { return K_VLANT; } "#"?("vlan_pr") { return K_VLANP; } "#"?("rand") { return K_RAND; } +"#"?("vlan_proto") { return K_VLANPR; } ":" { return ':'; } "," { return ','; } diff --git a/tools/net/bpf_exp.y b/tools/net/bpf_exp.y index e6306c5..9c7a0e7 100644 --- a/tools/net/bpf_exp.y +++ b/tools/net/bpf_exp.y @@ -56,7 +56,7 @@ static void bpf_set_jmp_label(char *label, enum jmp_type type); %token OP_LDXI %token K_PKT_LEN K_PROTO K_TYPE K_NLATTR K_NLATTR_NEST K_MARK K_QUEUE K_HATYPE -%token K_RXHASH K_CPU K_IFIDX K_VLANT K_VLANP K_POFF K_RAND +%token K_RXHASH K_CPU K_IFIDX K_VLANT K_VLANP K_POFF K_RAND K_VLANPR %token ':' ',' '[' ']' '(' ')' 'x' 'a' '+' 'M' '*' '&' '#' '%' @@ -167,6 +167,9 @@ ldb | OP_LDB K_RAND { bpf_set_curr_instr(BPF_LD | BPF_B | BPF_ABS, 0, 0, SKF_AD_OFF + SKF_AD_RANDOM); } + | OP_LDB K_VLANPR { + bpf_set_curr_instr(BPF_LD | BPF_B | BPF_ABS, 0, 0, + SKF_AD_OFF + SKF_AD_VLAN_PROTO); } ; ldh @@ -218,6 +221,9 @@ ldh | OP_LDH K_RAND { bpf_set_curr_instr(BPF_LD | BPF_H | BPF_ABS, 0, 0, SKF_AD_OFF + SKF_AD_RANDOM); } + | OP_LDH K_VLANPR { + bpf_set_curr_instr(BPF_LD | BPF_H | BPF_ABS, 0, 0, + SKF_AD_OFF + SKF_AD_VLAN_PROTO); } ; ldi @@ -274,6 +280,9 @@ ld | OP_LD K_RAND { bpf_set_curr_instr(BPF_LD | BPF_W | BPF_ABS, 0, 0, SKF_AD_OFF + SKF_AD_RANDOM); } + | OP_LD K_VLANPR { + bpf_set_curr_instr(BPF_LD | BPF_W | BPF_ABS, 0, 0, + SKF_AD_OFF + SKF_AD_VLAN_PROTO); } | OP_LD 'M' '[' number ']' { bpf_set_curr_instr(BPF_LD | BPF_MEM, 0, 0, $4); } | OP_LD '[' 'x' '+' number ']' {
This commit introduces new BPF extension. It makes possible to load value of skb->vlan_proto (vlan tpid) to register A. Currently, vlan header is removed from frame and information is available to userspace only via tpacket interface. Hence, it is not possible to install filter which uses value of vlan tpid field. AFAICT only way how to filter based on tpid value is to reconstruct original frame encapsulation and interpret BPF filter code in userspace. Doing that is way slower than doing filtering in kernel. Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Michal Sekletar <msekleta@redhat.com> --- Documentation/networking/filter.txt | 1 + arch/arm/net/bpf_jit_32.c | 6 ++++++ arch/mips/net/bpf_jit.c | 6 ++++++ arch/powerpc/net/bpf_jit_comp.c | 5 +++++ arch/s390/net/bpf_jit_comp.c | 5 +++++ arch/sparc/net/bpf_jit_comp.c | 3 +++ include/linux/filter.h | 1 + include/uapi/linux/filter.h | 3 ++- net/core/filter.c | 10 ++++++++++ tools/net/bpf_exp.l | 1 + tools/net/bpf_exp.y | 11 ++++++++++- 11 files changed, 50 insertions(+), 2 deletions(-)