[RFC] bpf: add bpf_link support for BPF_NETFILTER programs

Message ID 20230130150432.24924-1-fw@strlen.de
State RFC
Delegated to: Pablo Neira

Commit Message

Florian Westphal Jan. 30, 2023, 3:04 p.m. UTC
Doesn't apply, doesn't work -- there is no BPF_NETFILTER program type.

Sketches the uapi.  Example usage:

	union bpf_attr attr = { };

	attr.link_create.prog_fd = progfd;
	attr.link_create.attach_type = BPF_NETFILTER;
	attr.link_create.netfilter.pf = PF_INET;
	attr.link_create.netfilter.hooknum = NF_INET_LOCAL_IN;
	attr.link_create.netfilter.prio = -128;

	err = bpf(BPF_LINK_CREATE, &attr, sizeof(attr));

... this would attach progfd to ipv4:input hook.

Is BPF_LINK the right place?  The hook gets removed automatically when the
calling program exits; afaict this is intended.

Should a program running in init_netns be allowed to attach hooks in other netns too?

I could do what BPF_LINK_TYPE_NETNS is doing and fetch net via
get_net_ns_by_fd(attr->link_create.target_fd);

For the actual BPF_NETFILTER program type I plan to follow what the bpf
flow dissector is doing, i.e. pretend prototype is

func(struct __sk_buff *skb)

but pass a custom program specific context struct on kernel side.
Verifier will rewrite accesses as needed.

Things like nf_hook_state->in (net_device) could then be exposed via
kfuncs.

nf_hook_run_bpf() (c-function that creates the program context and
calls the real bpf prog) would be "updated" to use the bpf dispatcher to
avoid the indirect call overhead.

Does that seem ok to you?  I'd ignore the bpf dispatcher for now and
would work on the needed verifier changes first.

Thanks.
---
 include/linux/netfilter.h           |   1 +
 include/net/netfilter/nf_hook_bpf.h |   3 +
 include/uapi/linux/bpf.h            |  13 ++++
 kernel/bpf/syscall.c                |   7 ++
 net/netfilter/nf_hook_bpf.c         | 114 ++++++++++++++++++++++++++++
 5 files changed, 138 insertions(+)

Comments

Toke Høiland-Jørgensen Jan. 30, 2023, 5:38 p.m. UTC | #1
Florian Westphal <fw@strlen.de> writes:

> Doesn't apply, doesn't work -- there is no BPF_NETFILTER program type.
>
> Sketches the uapi.  Example usage:
>
> 	union bpf_attr attr = { };
>
> 	attr.link_create.prog_fd = progfd;
> 	attr.link_create.attach_type = BPF_NETFILTER;
> 	attr.link_create.netfilter.pf = PF_INET;
> 	attr.link_create.netfilter.hooknum = NF_INET_LOCAL_IN;
> 	attr.link_create.netfilter.prio = -128;
>
> 	err = bpf(BPF_LINK_CREATE, &attr, sizeof(attr));
>
> ... this would attach progfd to ipv4:input hook.
>
> Is BPF_LINK the right place?  Hook gets removed automatically if the calling program
> exits, afaict this is intended.

Yes, this is indeed intended for bpf_link. This plays well with
applications that use the API and stick around (because things get
cleaned up after them automatically even if they crash, say), but it
doesn't work so well for programs that don't (which, notably, includes
command line utilities like 'nft').

This is why I personally never really liked those semantics for
networking hooks: If I run a utility that attaches an XDP program I
generally expect that to stick around until the netdev disappears unless
something else explicitly removes it. (Yes you can pin a bpf_link, but
then you have the opposite problem: if the netdev disappears some entity
has to remove the pinned link, or you'll have a zombie program present
in the kernel until the next reboot).
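
For reference, pinning a link boils down to a BPF_OBJ_PIN call with the link fd and a path on a bpffs mount. A minimal sketch, assuming bpffs is mounted at /sys/fs/bpf; pin_link() and the path used below are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/*
 * Pin a BPF object (here: a link fd) at 'path' on a bpffs mount so it
 * outlives the process. Returns 0 on success, -1 with errno set.
 */
static int pin_link(int link_fd, const char *path)
{
	union bpf_attr attr;

	assert(path != NULL);

	memset(&attr, 0, sizeof(attr));
	attr.pathname = (uint64_t)(uintptr_t)path;
	attr.bpf_fd = link_fd;

	return (int)syscall(__NR_bpf, BPF_OBJ_PIN, &attr, sizeof(attr));
}
```

Something like pin_link(link_fd, "/sys/fs/bpf/nf_link_example") keeps the attachment alive after exit, and unlink() on that path drops the pin. Which entity calls unlink() once the netdev is gone is the open problem described above.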

For XDP and TC users can choose between bpf_link and netlink for
attachment and get one of the two semantics (goes away on close or stays
put). Not sure if it would make sense to do the same for nftables?

> Should a program running in init_netns be allowed to attach hooks in other netns too?
>
> I could do what BPF_LINK_TYPE_NETNS is doing and fetch net via
> get_net_ns_by_fd(attr->link_create.target_fd);

We don't allow that for any other type of BPF program; the expectation
is that the entity doing the attachment will move to the right ns first.
Is there any particular use case for doing something different for
nftables?

> For the actual BPF_NETFILTER program type I plan to follow what the bpf
> flow dissector is doing, i.e. pretend prototype is
>
> func(struct __sk_buff *skb)
>
> but pass a custom program specific context struct on kernel side.
> Verifier will rewrite accesses as needed.

This sounds reasonable, and also promotes code reuse between program
types (say, you can write some BPF code to parse a packet and reuse it
between the flow dissector, TC and netfilter).

> Things like nf_hook_state->in (net_device) could then be exposed via
> kfuncs.

Right, so like:

state = bpf_nf_get_hook_state(ctx); ?

Sounds OK to me.

> nf_hook_run_bpf() (c-function that creates the program context and
> calls the real bpf prog) would be "updated" to use the bpf dispatcher to
> avoid the indirect call overhead.

What 'bpf dispatcher' are you referring to here? We have way too many
things with that name :P

> Does that seem ok to you?  I'd ignore the bpf dispatcher for now and
> would work on the needed verifier changes first.

Getting something that works first seems reasonable, sure! :)

-Toke

Florian Westphal Jan. 30, 2023, 6:01 p.m. UTC | #2
Toke Høiland-Jørgensen <toke@kernel.org> wrote:
> > Is BPF_LINK the right place?  Hook gets removed automatically if the calling program
> > exits, afaict this is intended.
> 
> Yes, this is indeed intended for bpf_link. This plays well with
> applications that use the API and stick around (because things get
> cleaned up after them automatically even if they crash, say), but it
> doesn't work so well for programs that don't (which, notably, includes
> command line utilities like 'nft').

Right, but I did not want to create a dependency on nfnetlink or
nftables netlink right from the start.

> For XDP and TC users can choose between bpf_link and netlink for
> attachment and get one of the two semantics (goes away on close or stays
> put). Not sure if it would make sense to do the same for nftables?

For nftables I suspect that, if nft can emit bpf, it would make sense to
pass the bpf descriptor together with nftables netlink, i.e. along with
the normal netlink data.

nftables kernel side would then know to use the bpf prog for the
datapath instead of the interpreter and could even fall back to
the interpreter.

But for the raw hook use case that Alexei and Daniel preferred (cf.
initial proposal to call bpf progs from nf_tables interpreter) I think
that there should be no nftables dependency.

I could add a new nfnetlink subtype for nf-bpf if bpf_link is not
appropriate as an alternative.

> > Should a program running in init_netns be allowed to attach hooks in other netns too?
> >
> > I could do what BPF_LINK_TYPE_NETNS is doing and fetch net via
> > get_net_ns_by_fd(attr->link_create.target_fd);
> 
> We don't allow that for any other type of BPF program; the expectation
> is that the entity doing the attachment will move to the right ns first.
> Is there any particular use case for doing something different for
> nftables?

Nope, not at all.  I was just curious.  So no special handling for
init_netns needed, good.

> > For the actual BPF_NETFILTER program type I plan to follow what the bpf
> > flow dissector is doing, i.e. pretend prototype is
> >
> > func(struct __sk_buff *skb)
> >
> > but pass a custom program specific context struct on kernel side.
> > Verifier will rewrite accesses as needed.
> 
> This sounds reasonable, and also promotes code reuse between program
> types (say, you can write some BPF code to parse a packet and reuse it
> between the flow dissector, TC and netfilter).

Ok, thanks.

> > Things like nf_hook_state->in (net_device) could then be exposed via
> > kfuncs.
> 
> Right, so like:
> 
> state = bpf_nf_get_hook_state(ctx); ?
> 
> Sounds OK to me.

Yes, something like that.  Downside is that the nf_hook_state struct
is no longer bound by any abi contract, but as I understood it that's
fine.

> > nf_hook_run_bpf() (c-function that creates the program context and
> > calls the real bpf prog) would be "updated" to use the bpf dispatcher to
> > avoid the indirect call overhead.
> 
> What 'bpf dispatcher' are you referring to here? We have way too many
> things with that name :P

I meant 'DEFINE_BPF_DISPATCHER(nf_user_progs);'

Thanks.

Toke Høiland-Jørgensen Jan. 30, 2023, 9:10 p.m. UTC | #3
Florian Westphal <fw@strlen.de> writes:

> Toke Høiland-Jørgensen <toke@kernel.org> wrote:
>> > Is BPF_LINK the right place?  Hook gets removed automatically if the calling program
>> > exits, afaict this is intended.
>> 
>> Yes, this is indeed intended for bpf_link. This plays well with
>> applications that use the API and stick around (because things get
>> cleaned up after them automatically even if they crash, say), but it
>> doesn't work so well for programs that don't (which, notably, includes
>> command line utilities like 'nft').
>
> Right, but I did not want to create a dependency on nfnetlink or
> nftables netlink right from the start.

Dependency how? For userspace consumers, you mean?

>> For XDP and TC users can choose between bpf_link and netlink for
>> attachment and get one of the two semantics (goes away on close or stays
>> put). Not sure if it would make sense to do the same for nftables?
>
> For nftables I suspect that, if nft can emit bpf, it would make sense to
> pass the bpf descriptor together with nftables netlink, i.e. along with
> the normal netlink data.
>
> nftables kernel side would then know to use the bpf prog for the
> datapath instead of the interpreter and could even fall back to
> the interpreter.
>
> But for the raw hook use case that Alexei and Daniel preferred (cf.
> initial proposal to call bpf progs from nf_tables interpreter) I think
> that there should be no nftables dependency.
>
> I could add a new nfnetlink subtype for nf-bpf if bpf_link is not
> appropriate as an alternative.

I don't think there's anything wrong with bpf_link as an initial
interface at least. I just think it should (eventually) be possible to
load a BPF-based firewall from the command line via this interface,
without having to resort to pinning. There was some talk about adding
this as a mode to the bpf_link interface itself at some point, but that
never materialised (probably because the need is not great since the
netlink interface serves this purpose for TC/XDP).

>> > Things like nf_hook_state->in (net_device) could then be exposed via
>> > kfuncs.
>> 
>> Right, so like:
>> 
>> state = bpf_nf_get_hook_state(ctx); ?
>> 
>> Sounds OK to me.
>
> Yes, something like that.  Downside is that the nf_hook_state struct
> is no longer bound by any abi contract, but as I understood it that's
> fine.

Well, there's an ongoing discussion about what, if any, should be the
expectations around kfunc stability:

https://lore.kernel.org/r/20230117212731.442859-1-toke@redhat.com

I certainly don't think it's problematic for a subsystem to give *more*
stability guarantees than BPF core. I mean, if you want the kfunc
interface to be stable you just... don't change it? :)

>> > nf_hook_run_bpf() (c-function that creates the program context and
>> > calls the real bpf prog) would be "updated" to use the bpf dispatcher to
>> > avoid the indirect call overhead.
>> 
>> What 'bpf dispatcher' are you referring to here? We have way too many
>> things with that name :P
>
> I meant 'DEFINE_BPF_DISPATCHER(nf_user_progs);'

Ah, right. Yeah, that can definitely be added later!

-Toke

Alexei Starovoitov Jan. 30, 2023, 9:44 p.m. UTC | #4
On Mon, Jan 30, 2023 at 07:01:15PM +0100, Florian Westphal wrote:
> Toke Høiland-Jørgensen <toke@kernel.org> wrote:
> > > Is BPF_LINK the right place?  Hook gets removed automatically if the calling program
> > > exits, afaict this is intended.

Yes. bpf_link is the right model.
I'd also allow more than one BPF_NETFILTER prog at the hook.
When Daniel respins his tc bpf_link set there will be a way to do that
for tc and hopefully soon for xdp.
For netfilter hook we can use the same approach.

> > 
> > Yes, this is indeed intended for bpf_link. This plays well with
> > applications that use the API and stick around (because things get
> > cleaned up after them automatically even if they crash, say), but it
> > doesn't work so well for programs that don't (which, notably, includes
> > command line utilities like 'nft').
> 
> Right, but I did not want to create a dependency on nfnetlink or
> nftables netlink right from the start.
> 
> > For XDP and TC users can choose between bpf_link and netlink for
> > attachment and get one of the two semantics (goes away on close or stays
> > put). Not sure if it would make sense to do the same for nftables?
> 
> For nftables I suspect that, if nft can emit bpf, it would make sense to
> pass the bpf descriptor together with nftables netlink, i.e. along with
> the normal netlink data.
> 
> nftables kernel side would then know to use the bpf prog for the
> > datapath instead of the interpreter and could even fall back to
> > the interpreter.
> 
> But for the raw hook use case that Alexei and Daniel preferred (cf.
> initial proposal to call bpf progs from nf_tables interpreter) I think
> that there should be no nftables dependency.
> 
> I could add a new nfnetlink subtype for nf-bpf if bpf_link is not
> appropriate as an alternative.

Let's start with bpf_link and figure out netlink path when appropriate.

> > > Should a program running in init_netns be allowed to attach hooks in other netns too?
> > >
> > > I could do what BPF_LINK_TYPE_NETNS is doing and fetch net via
> > > get_net_ns_by_fd(attr->link_create.target_fd);
> > 
> > We don't allow that for any other type of BPF program; the expectation
> > is that the entity doing the attachment will move to the right ns first.
> > Is there any particular use case for doing something different for
> > nftables?
> 
> Nope, not at all.  I was just curious.  So no special handling for
> init_netns needed, good.
> 
> > > For the actual BPF_NETFILTER program type I plan to follow what the bpf
> > > flow dissector is doing, i.e. pretend prototype is
> > >
> > > func(struct __sk_buff *skb)
> > >
> > > but pass a custom program specific context struct on kernel side.
> > > Verifier will rewrite accesses as needed.
> > 
> > This sounds reasonable, and also promotes code reuse between program
> > types (say, you can write some BPF code to parse a packet and reuse it
> > between the flow dissector, TC and netfilter).
> 
> Ok, thanks.
> 
> > > Things like nf_hook_state->in (net_device) could then be exposed via
> > > kfuncs.
> > 
> > Right, so like:
> > 
> > state = bpf_nf_get_hook_state(ctx); ?
> > 
> > Sounds OK to me.
> 
> Yes, something like that.  Downside is that the nf_hook_state struct
> is no longer bound by any abi contract, but as I understood it that's
> fine.

I'd steer clear from new abi-s.
Don't look at uapi __sk_buff model. It's not a great example to follow.
Just pass kernel nf_hook_state into bpf prog and let program deal
with changes to it via CORE.
The prog will get a definition of 'struct nf_hook_state' from vmlinux.h
or via private 'struct nf_hook_state___flavor' with few fields defined
that prog wants to use. CORE will deal with offset adjustments.
That's a lot less kernel code. No need for asm style ctx rewrites.
Just see how much kernel code we already burned on *convert_ctx_access().
We cannot remove this tech debt due to uapi.
When you pass struct nf_hook_state directly none of it is needed.
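
A rough sketch of the "private flavor" approach described above, from the program's point of view. The entry point, section name and the way the state pointer reaches the program are not defined by this RFC, so only the field access is shown; nf_verdict() and the policy it implements are hypothetical. The fallback branch exists only so the logic can be built and exercised natively.

```c
#ifdef __bpf__
/* BPF target: have clang record CO-RE relocations for field accesses. */
#define __core __attribute__((preserve_access_index))
#else
/* Native build, only useful for exercising the logic in userspace. */
#include <assert.h>
#define __core
#endif

/* Verdicts and hook number, values as in the netfilter uapi headers. */
#define NF_DROP			0
#define NF_ACCEPT		1
#define NF_INET_LOCAL_IN	1

/*
 * Private flavor of the kernel's struct nf_hook_state. Only the field
 * the program reads is declared; the ___flavor suffix is ignored when
 * the type is matched against kernel BTF, and the field offset is
 * fixed up at load time.
 */
struct nf_hook_state___flavor {
	unsigned char hook;
} __core;

/* Hypothetical policy: drop at the local-in hook, accept elsewhere. */
static unsigned int nf_verdict(const struct nf_hook_state___flavor *state)
{
	if (state->hook == NF_INET_LOCAL_IN)
		return NF_DROP;
	return NF_ACCEPT;
}
```

If a field moves inside struct nf_hook_state, the program source stays the same; that is the part convert_ctx_access() otherwise has to emulate in the kernel.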

Florian Westphal Jan. 31, 2023, 2:18 p.m. UTC | #5
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> Yes. bpf_link is the right model.
> I'd also allow more than one BPF_NETFILTER prog at the hook.
> When Daniel respins his tc bpf_link set there will be a way to do that
> for tc and hopefully soon for xdp.
> For netfilter hook we can use the same approach.

nf should already support several programs per hook: the built-in limit
in the nf core is currently 1024 hooks per family/hook location.
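
As a rough userspace model of what several programs at one hook amount to: entries for one family/hook location are kept sorted by priority, lower values run first, and registration fails once the limit is reached. struct hook_list and hook_list_add() are made-up names, and both the error code and the placement of equal priorities are assumptions of this model, not statements about the nf core.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_HOOK_COUNT 1024	/* limit quoted in the mail above */

/* Priorities registered at one family/hook location, ascending. */
struct hook_list {
	int prio[MAX_HOOK_COUNT];
	int count;
};

/* Insert 'prio' while keeping the list sorted in ascending order. */
static int hook_list_add(struct hook_list *l, int prio)
{
	int i;

	assert(l != NULL);

	if (l->count >= MAX_HOOK_COUNT)
		return -E2BIG;

	/* Find the first entry that is not strictly lower than 'prio'. */
	for (i = 0; i < l->count && l->prio[i] < prio; i++)
		;

	/* Shift the tail up by one slot and place the new entry. */
	memmove(&l->prio[i + 1], &l->prio[i],
		(size_t)(l->count - i) * sizeof(l->prio[0]));
	l->prio[i] = prio;
	l->count++;
	return 0;
}
```

Adding priorities 0, -128 and 100 in that order leaves the list as -128, 0, 100, so a program attached with priority -128 as in the original example would run before one attached at 0.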

> > I could add a new nfnetlink subtype for nf-bpf if bpf_link is not
> > appropriate as an alternative.
> 
> Let's start with bpf_link and figure out netlink path when appropriate.

Good, that works for me.

> I'd steer clear from new abi-s.
> Don't look at uapi __sk_buff model. It's not a great example to follow.
> Just pass kernel nf_hook_state into bpf prog and let program deal
> with changes to it via CORE.

The current prototype for nf hooks is

fun(void *private, struct sk_buff *skb, struct nf_hook_state *s)

Originally I had intended to place sk_buff in nf_hook_state, but it's
quite some code churn for everyone else.

So I'm leaning towards something like
	struct nf_bpf_ctx {
		struct nf_hook_state *state;
		struct sk_buff *skb;
	};

that gets passed as argument.
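
A userspace model of the adapter this implies: the hook function keeps the nf_hookfn-style arguments, packs skb and state into the context struct and hands that single pointer to the program. The kernel types are replaced by small stand-ins, struct toy_prog and drop_empty_skb() are hypothetical, and the state pointer is const here to match nf_hook_run_bpf() in the patch.

```c
#include <assert.h>

#define NF_DROP		0
#define NF_ACCEPT	1

/* Userspace stand-ins for the kernel types; only what the model needs. */
struct sk_buff { unsigned int len; };
struct nf_hook_state { unsigned char hook; unsigned char pf; };

/* Context layout as proposed in the mail above. */
struct nf_bpf_ctx {
	const struct nf_hook_state *state;
	struct sk_buff *skb;
};

/* Hypothetical stand-in for a loaded program: just a function pointer. */
struct toy_prog {
	unsigned int (*run)(const struct nf_bpf_ctx *ctx);
};

/* Example program: drop zero-length buffers, accept the rest. */
static unsigned int drop_empty_skb(const struct nf_bpf_ctx *ctx)
{
	return ctx->skb->len == 0 ? NF_DROP : NF_ACCEPT;
}

/* Hook entry with nf_hookfn-style arguments; priv carries the program. */
static unsigned int nf_hook_run_prog(void *priv, struct sk_buff *skb,
				     const struct nf_hook_state *s)
{
	const struct toy_prog *prog = priv;
	struct nf_bpf_ctx ctx = { .state = s, .skb = skb };

	assert(prog && prog->run);
	return prog->run(&ctx);
}
```

The context lives on the stack of the hook function, so nothing in nf_hook_state or the existing hook prototype has to change for other users.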

> The prog will get a definition of 'struct nf_hook_state' from vmlinux.h
> or via private 'struct nf_hook_state___flavor' with few fields defined
> that prog wants to use. CORE will deal with offset adjustments.
> That's a lot less kernel code. No need for asm style ctx rewrites.
> Just see how much kernel code we already burned on *convert_ctx_access().
> We cannot remove this tech debt due to uapi.
> When you pass struct nf_hook_state directly none of it is needed.

Ok, thanks for pointing that out.  I did not realize
convert_ctx_access() conversions were frowned upon.

I will pass a known/exposed struct then.

I thought __sk_buff was required for direct packet access, I will look
at this again.

Toke Høiland-Jørgensen Jan. 31, 2023, 4:19 p.m. UTC | #6
Florian Westphal <fw@strlen.de> writes:

>> The prog will get a definition of 'struct nf_hook_state' from vmlinux.h
>> or via private 'struct nf_hook_state___flavor' with few fields defined
>> that prog wants to use. CORE will deal with offset adjustments.
>> That's a lot less kernel code. No need for asm style ctx rewrites.
>> Just see how much kernel code we already burned on *convert_ctx_access().
>> We cannot remove this tech debt due to uapi.
>> When you pass struct nf_hook_state directly none of it is needed.
>
> Ok, thanks for pointing that out.  I did not realize
> convert_ctx_access() conversions were frowned upon.
>
> I will pass a known/exposed struct then.
>
> I thought __sk_buff was required for direct packet access, I will look
> at this again.

Kartikeya implemented direct packet access for struct xdp_md passed as a
BTF ID for use in the XDP queueing RFC. You could have a look at that as
a reference for how to do this for an sk_buff as well:

https://git.kernel.org/pub/scm/linux/kernel/git/toke/linux.git/commit/?h=xdp-queueing-07&id=3b4f3caaf59f3b2a7b6b37dfad96b5e42347786a

It does still involve a convert_ctx_access() function, albeit one for the
BTF ID. Not sure if there's an easier way...

-Toke
Patch

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 6820649a0d46..fbab6e2b463e 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -87,6 +87,7 @@  typedef unsigned int nf_hookfn(void *priv,
 enum nf_hook_ops_type {
 	NF_HOOK_OP_UNDEFINED,
 	NF_HOOK_OP_NF_TABLES,
+	NF_HOOK_OP_BPF,
 };
 
 struct nf_hook_ops {
diff --git a/include/net/netfilter/nf_hook_bpf.h b/include/net/netfilter/nf_hook_bpf.h
index d0e865a1843a..7014fd986ad9 100644
--- a/include/net/netfilter/nf_hook_bpf.h
+++ b/include/net/netfilter/nf_hook_bpf.h
@@ -1,6 +1,7 @@ 
 /* SPDX-License-Identifier: GPL-2.0 */
 struct bpf_dispatcher;
 struct bpf_prog;
+struct nf_hook_ops;
 
 #if IS_ENABLED(CONFIG_NF_HOOK_BPF)
 struct bpf_prog *nf_hook_bpf_create(const struct nf_hook_entries *n,
@@ -21,3 +22,5 @@  nf_hook_bpf_create(const struct nf_hook_entries *n, struct nf_hook_ops * const *
 
 static inline struct bpf_prog *nf_hook_bpf_get_fallback(void) { return NULL; }
 #endif
+
+int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index bc1a3d232ae4..387944db0228 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -986,6 +986,7 @@  enum bpf_prog_type {
 	BPF_PROG_TYPE_LSM,
 	BPF_PROG_TYPE_SK_LOOKUP,
 	BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
+	BPF_PROG_TYPE_NETFILTER,
 };
 
 enum bpf_attach_type {
@@ -1033,6 +1034,7 @@  enum bpf_attach_type {
 	BPF_PERF_EVENT,
 	BPF_TRACE_KPROBE_MULTI,
 	BPF_LSM_CGROUP,
+	BPF_NETFILTER,
 	__MAX_BPF_ATTACH_TYPE
 };
 
@@ -1049,6 +1051,7 @@  enum bpf_link_type {
 	BPF_LINK_TYPE_PERF_EVENT = 7,
 	BPF_LINK_TYPE_KPROBE_MULTI = 8,
 	BPF_LINK_TYPE_STRUCT_OPS = 9,
+	BPF_LINK_TYPE_NETFILTER = 10,
 
 	MAX_BPF_LINK_TYPE,
 };
@@ -1538,6 +1541,11 @@  union bpf_attr {
 				 */
 				__u64		cookie;
 			} tracing;
+			struct {
+				__u32		pf;
+				__u32		hooknum;
+				__s32		prio;
+			} netfilter;
 		};
 	} link_create;
 
@@ -6342,6 +6350,11 @@  struct bpf_link_info {
 		struct {
 			__u32 ifindex;
 		} xdp;
+		struct {
+			__u32 pf;
+			__u32 hooknum;
+			__s32 priority;
+		} netfilter;
 	};
 } __attribute__((aligned(8)));
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index a3f969f1aed5..fdfbabdd9222 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -35,6 +35,7 @@ 
 #include <linux/rcupdate_trace.h>
 #include <linux/memcontrol.h>
 #include <linux/trace_events.h>
+#include <net/netfilter/nf_hook_bpf.h>
 
 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
 			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
@@ -2433,6 +2434,7 @@  static bool is_net_admin_prog_type(enum bpf_prog_type prog_type)
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
 	case BPF_PROG_TYPE_EXT: /* extends any prog */
+	case BPF_PROG_TYPE_NETFILTER:
 		return true;
 	case BPF_PROG_TYPE_CGROUP_SKB:
 		/* always unpriv */
@@ -3452,6 +3454,8 @@  attach_type_to_prog_type(enum bpf_attach_type attach_type)
 		return BPF_PROG_TYPE_XDP;
 	case BPF_LSM_CGROUP:
 		return BPF_PROG_TYPE_LSM;
+	case BPF_NETFILTER:
+		return BPF_PROG_TYPE_NETFILTER;
 	default:
 		return BPF_PROG_TYPE_UNSPEC;
 	}
@@ -4605,6 +4609,9 @@  static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 	case BPF_PROG_TYPE_XDP:
 		ret = bpf_xdp_link_attach(attr, prog);
 		break;
+	case BPF_PROG_TYPE_NETFILTER:
+		ret = bpf_nf_link_attach(attr, prog);
+		break;
 #endif
 	case BPF_PROG_TYPE_PERF_EVENT:
 	case BPF_PROG_TYPE_TRACEPOINT:
diff --git a/net/netfilter/nf_hook_bpf.c b/net/netfilter/nf_hook_bpf.c
index 55bede6e78cd..922f4c85a7ce 100644
--- a/net/netfilter/nf_hook_bpf.c
+++ b/net/netfilter/nf_hook_bpf.c
@@ -648,3 +648,117 @@  void nf_hook_bpf_change_prog_and_release(struct bpf_dispatcher *d, struct bpf_pr
 	if (from && from != fallback_nf_hook_slow)
 		bpf_prog_put(from);
 }
+
+static unsigned int nf_hook_run_bpf(void *bpf_prog, struct sk_buff *skb, const struct nf_hook_state *s)
+{
+	/* BPF_DISPATCHER_FUNC(nf_hook_base)(state, prog->insnsi, prog->bpf_func); */
+
+	pr_info_ratelimited("%s called at hook %d for pf %d\n", __func__, s->hook, s->pf);
+	return NF_ACCEPT;
+}
+
+struct bpf_nf_link {
+	struct bpf_link link;
+	struct nf_hook_ops hook_ops;
+	struct net *net;
+};
+
+static void bpf_nf_link_release(struct bpf_link *link)
+{
+	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link);
+
+	nf_unregister_net_hook(nf_link->net, &nf_link->hook_ops);
+}
+
+static void bpf_nf_link_dealloc(struct bpf_link *link)
+{
+	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link);
+
+	kfree(nf_link);
+}
+
+static int bpf_nf_link_detach(struct bpf_link *link)
+{
+	bpf_nf_link_release(link);
+	return 0;
+}
+
+static void bpf_nf_link_show_info(const struct bpf_link *link,
+				  struct seq_file *seq)
+{
+	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link);
+
+	seq_printf(seq, "pf:\t%u\thooknum:\t%u\tprio:\t%d\n",
+		  nf_link->hook_ops.pf, nf_link->hook_ops.hooknum,
+		  nf_link->hook_ops.priority);
+}
+
+static int bpf_nf_link_fill_link_info(const struct bpf_link *link,
+				      struct bpf_link_info *info)
+{
+	struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link);
+
+	info->netfilter.pf = nf_link->hook_ops.pf;
+	info->netfilter.hooknum = nf_link->hook_ops.hooknum;
+	info->netfilter.priority = nf_link->hook_ops.priority;
+
+	return 0;
+}
+
+static int bpf_nf_link_update(struct bpf_link *link, struct bpf_prog *new_prog,
+			      struct bpf_prog *old_prog)
+{
+	return -EOPNOTSUPP;
+}
+
+static const struct bpf_link_ops bpf_nf_link_lops = {
+	.release = bpf_nf_link_release,
+	.dealloc = bpf_nf_link_dealloc,
+	.detach = bpf_nf_link_detach,
+	.show_fdinfo = bpf_nf_link_show_info,
+	.fill_link_info = bpf_nf_link_fill_link_info,
+	.update_prog = bpf_nf_link_update,
+};
+
+int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+{
+	struct net *net = current->nsproxy->net_ns;
+	struct bpf_link_primer link_primer;
+	struct bpf_nf_link *link;
+	int err;
+
+	if (attr->link_create.flags)
+		return -EINVAL;
+
+	link = kzalloc(sizeof(*link), GFP_USER);
+	if (!link)
+		return -ENOMEM;
+
+	bpf_link_init(&link->link, BPF_LINK_TYPE_NETFILTER, &bpf_nf_link_lops, prog);
+
+	link->hook_ops.hook = nf_hook_run_bpf;
+	link->hook_ops.hook_ops_type = NF_HOOK_OP_BPF;
+	link->hook_ops.priv = prog;
+
+	link->hook_ops.pf = attr->link_create.netfilter.pf;
+	link->hook_ops.priority = attr->link_create.netfilter.prio;
+	link->hook_ops.hooknum = attr->link_create.netfilter.hooknum;
+
+	link->net = net;
+
+	err = bpf_link_prime(&link->link, &link_primer);
+	if (err)
+		goto out_free;
+
+	err = nf_register_net_hook(net, &link->hook_ops);
+	if (err) {
+		bpf_link_cleanup(&link_primer);
+		return err;
+	}
+
+	return bpf_link_settle(&link_primer);
+
+out_free:
+	kfree(link);
+	return err;
+}
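
With the patch applied, bpf_nf_link_show_info() would add one tab-separated line to /proc/<pid>/fdinfo/<fd> for a netfilter link. A minimal sketch of parsing that line in userspace, assuming the format string stays exactly as in the hunk above; struct nf_link_fdinfo and parse_nf_link_fdinfo() are names made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Values printed by the RFC's show_fdinfo callback. */
struct nf_link_fdinfo {
	unsigned int pf;
	unsigned int hooknum;
	int prio;
};

/*
 * Parse one "pf:\t<u>\thooknum:\t<u>\tprio:\t<d>" line.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_nf_link_fdinfo(const char *line, struct nf_link_fdinfo *out)
{
	int n;

	assert(line && out);

	/* A space in the format matches any run of whitespace, tabs included. */
	n = sscanf(line, "pf: %u hooknum: %u prio: %d",
		   &out->pf, &out->hooknum, &out->prio);

	return n == 3 ? 0 : -1;
}
```

A caller would read the fdinfo file line by line and pass each line to the parser until it returns 0; the other lines in that file do not start with "pf:" and are rejected.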