[bpf-next,v4,2/6] bpf: introduce BPF dispatcher

Message ID 20191211123017.13212-3-bjorn.topel@gmail.com
State Changes Requested
Delegated to: BPF Maintainers
Series Introduce the BPF dispatcher

Commit Message

Björn Töpel Dec. 11, 2019, 12:30 p.m. UTC
From: Björn Töpel <bjorn.topel@intel.com>

The BPF dispatcher is a multi-way branch code generator, mainly
targeted at XDP programs. When an XDP program is executed via
bpf_prog_run_xdp(), it is invoked through an indirect call. The indirect
call has a substantial performance impact when retpolines are
enabled. The dispatcher transforms indirect calls into direct calls, and
therefore avoids the retpoline. The dispatcher is generated using the
BPF JIT, and relies on text poking provided by bpf_arch_text_poke().

The dispatcher hijacks a trampoline function via the __fentry__ nop
of the trampoline. One dispatcher instance currently supports up to 64
dispatch points. A user creates a dispatcher and its corresponding
trampoline with the DEFINE_BPF_DISPATCHER macro.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 arch/x86/net/bpf_jit_comp.c | 122 +++++++++++++++++++++++++++
 include/linux/bpf.h         |  56 +++++++++++++
 kernel/bpf/Makefile         |   1 +
 kernel/bpf/dispatcher.c     | 159 ++++++++++++++++++++++++++++++++++++
 4 files changed, 338 insertions(+)
 create mode 100644 kernel/bpf/dispatcher.c
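
As a rough illustration of the idea in the commit message, the following user-space C sketch models what the generated dispatcher does: compare the incoming function pointer against known targets and call a match directly, otherwise fall back to the indirect call. All names here (prog_fn, prog_pass, prog_drop, prog_other, dispatch) are hypothetical and not part of the patch. The JIT in the patch arranges the comparisons as a binary search over sorted addresses rather than the linear chain shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for BPF programs; names are illustrative only. */
typedef unsigned int (*prog_fn)(const void *ctx);

static unsigned int prog_pass(const void *ctx) { (void)ctx; return 2; }
static unsigned int prog_drop(const void *ctx) { (void)ctx; return 1; }
static unsigned int prog_other(const void *ctx) { (void)ctx; return 0; }

/*
 * Model of the dispatcher body: compare the function pointer against
 * each known target and, on a match, call that target by name (a
 * direct call). Unknown targets take the indirect-call fallback.
 */
static unsigned int dispatch(const void *ctx, prog_fn fn)
{
	assert(fn != NULL);
	if (fn == prog_pass)
		return prog_pass(ctx);	/* direct call */
	if (fn == prog_drop)
		return prog_drop(ctx);	/* direct call */
	return fn(ctx);			/* fallback: indirect call */
}
```

A caller would use dispatch(ctx, prog->bpf_func) in place of the plain indirect call; the result is the same either way, only the call mechanism differs.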

Comments

Toke Høiland-Jørgensen Dec. 11, 2019, 1:26 p.m. UTC | #1
[...]
> +/* The BPF dispatcher is a multiway branch code generator. The
> + * dispatcher is a mechanism to avoid the performance penalty of an
> + * indirect call, which is expensive when retpolines are enabled. A
> + * dispatch client registers a BPF program into the dispatcher, and if
> + * there is available room in the dispatcher a direct call to the BPF
> + * program will be generated. All calls to the BPF programs called via
> + * the dispatcher will then be a direct call, instead of an
> + * indirect. The dispatcher hijacks a trampoline function it via the
> + * __fentry__ of the trampoline. The trampoline function has the
> + * following signature:
> + *
> + * unsigned int trampoline(const void *xdp_ctx,
> + *                         const struct bpf_insn *insnsi,
> + *                         unsigned int (*bpf_func)(const void *,
> + *                                                  const struct bpf_insn *));
> + */

Nit: s/xdp_ctx/ctx/

-Toke
Alexei Starovoitov Dec. 13, 2019, 5:30 a.m. UTC | #2
On Wed, Dec 11, 2019 at 01:30:13PM +0100, Björn Töpel wrote:
> +
> +#define DEFINE_BPF_DISPATCHER(name)					\
> +	unsigned int name##func(					\
> +		const void *xdp_ctx,					\
> +		const struct bpf_insn *insnsi,				\
> +		unsigned int (*bpf_func)(const void *,			\
> +					 const struct bpf_insn *))	\
> +	{								\
> +		return bpf_func(xdp_ctx, insnsi);			\
> +	}								\
> +	EXPORT_SYMBOL(name##func);			\
> +	struct bpf_dispatcher name = BPF_DISPATCHER_INIT(name);

The dispatcher function is a normal function. EXPORT_SYMBOL doesn't make it
'noinline'. struct bpf_dispatcher takes a pointer to it and that address is
used for text_poke.

In patch 3 __BPF_PROG_RUN calls dfunc() from two places.
What stops compiler from inlining it? Or partially inlining it in one
or the other place?

I guess it works, because your compiler didn't inline it?
Could you share how asm looks for bpf_prog_run_xdp()
(where __BPF_PROG_RUN is called) and asm for name##func() ?

I hope my guess that compiler didn't inline it is correct. Then extra noinline
will not hurt and that's the only thing needed to avoid the issue.
Björn Töpel Dec. 13, 2019, 7:51 a.m. UTC | #3
On Fri, 13 Dec 2019 at 06:30, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Dec 11, 2019 at 01:30:13PM +0100, Björn Töpel wrote:
> > +
> > +#define DEFINE_BPF_DISPATCHER(name)                                  \
> > +     unsigned int name##func(                                        \
> > +             const void *xdp_ctx,                                    \
> > +             const struct bpf_insn *insnsi,                          \
> > +             unsigned int (*bpf_func)(const void *,                  \
> > +                                      const struct bpf_insn *))      \
> > +     {                                                               \
> > +             return bpf_func(xdp_ctx, insnsi);                       \
> > +     }                                                               \
> > +     EXPORT_SYMBOL(name##func);                      \
> > +     struct bpf_dispatcher name = BPF_DISPATCHER_INIT(name);
>
> The dispatcher function is a normal function. EXPORT_SYMBOL doesn't make it
> 'noinline'. struct bpf_dispatcher takes a pointer to it and that address is
> used for text_poke.
>
> In patch 3 __BPF_PROG_RUN calls dfunc() from two places.
> What stops compiler from inlining it? Or partially inlining it in one
> or the other place?
>

Good catch. For the XDP dispatcher no inlining is possible today, since
the trampoline function lives in a different compilation unit (filter.o)
than the users of bpf_prog_run_xdp(). With LTO turned on, that would no
longer hold. So, *not* having it marked as noinline is a bug.
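
To make the inlining concern concrete, here is a minimal user-space sketch, assuming GCC or clang. The names (demo_noinline, demo_trampoline, demo_run, demo_prog) are hypothetical; the kernel spells the attribute `noinline`. The point is that the trampoline must stay a real out-of-line function, because text poking patches the code at its address and only callers that actually call that address see the change.

```c
#include <assert.h>
#include <stddef.h>

/* GCC/clang attribute; the kernel provides this as the `noinline` macro. */
#define demo_noinline __attribute__((noinline))

typedef unsigned int (*demo_fn)(const void *ctx);

static unsigned int demo_prog(const void *ctx) { (void)ctx; return 42; }

/*
 * Out-of-line trampoline. With the attribute, the compiler is asked to
 * emit a real call to this address from every caller, even from callers
 * in the same translation unit, so patching the function entry affects
 * all of them.
 */
demo_noinline unsigned int demo_trampoline(const void *ctx, demo_fn fn)
{
	assert(fn != NULL);
	return fn(ctx);
}

/* Inlined wrapper, playing the role of bpf_prog_run_xdp(). */
static inline unsigned int demo_run(const void *ctx, demo_fn fn)
{
	return demo_trampoline(ctx, fn);
}
```

Without the attribute, a compiler that can see the body of demo_trampoline() is free to fold it into demo_run(), leaving a plain indirect call that no later patching of the trampoline can redirect.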

> I guess it works, because your compiler didn't inline it?
> Could you share how asm looks for bpf_prog_run_xdp()
> (where __BPF_PROG_RUN is called) and asm for name##func() ?
>

Sure! bpf_prog_run_xdp() is always inlined, so let's look at:
net/bpf/test_run.c:bpf_test_run:

        if (xdp)
            *retval = bpf_prog_run_xdp(prog, ctx);
        else
            *retval = BPF_PROG_RUN(prog, ctx);

translates to:

   0xffffffff8199f522 <+162>:   nopl   0x0(%rax,%rax,1)
./include/linux/filter.h:
716             return __BPF_PROG_RUN(prog, xdp,
   0xffffffff8199f527 <+167>:   mov    0x30(%rbp),%rdx
   0xffffffff8199f52b <+171>:   mov    %r14,%rsi
   0xffffffff8199f52e <+174>:   mov    %r13,%rdi
   0xffffffff8199f531 <+177>:   callq  0xffffffff819586d0 <bpf_dispatcher_xdpfunc>
   0xffffffff8199f536 <+182>:   mov    %eax,%ecx

net/bpf/test_run.c:
48                              *retval = BPF_PROG_RUN(prog, ctx);
   0xffffffff8199f538 <+184>:   mov    (%rsp),%rax
   0xffffffff8199f53c <+188>:   mov    %ecx,(%rax)
...
net/bpf/test_run.c:
45                      if (xdp)
   0xffffffff8199f582 <+258>:   test   %r15b,%r15b
   0xffffffff8199f585 <+261>:   jne    0xffffffff8199f522 <bpf_test_run+162>
   0xffffffff8199f587 <+263>:   nopl   0x0(%rax,%rax,1)

./include/linux/bpf.h:
497             return bpf_func(ctx, insnsi);
   0xffffffff8199f58c <+268>:   mov    0x30(%rbp),%rax
   0xffffffff8199f590 <+272>:   mov    %r14,%rsi
   0xffffffff8199f593 <+275>:   mov    %r13,%rdi
   0xffffffff8199f596 <+278>:   callq  0xffffffff81e00eb0 <__x86_indirect_thunk_rax>
   0xffffffff8199f59b <+283>:   mov    %eax,%ecx
   0xffffffff8199f59d <+285>:   jmp    0xffffffff8199f538 <bpf_test_run+184>

The "dfunc":

net/core/filter.c:
8944    DEFINE_BPF_DISPATCHER(bpf_dispatcher_xdp)
   0xffffffff819586d0 <+0>:     callq  0xffffffff81c01680 <__fentry__>
   0xffffffff819586d5 <+5>:     jmpq   0xffffffff81e00f10 <__x86_indirect_thunk_rdx>


> I hope my guess that compiler didn't inline it is correct. Then extra noinline
> will not hurt and that's the only thing needed to avoid the issue.
>

I'd say it's broken not marking it as noinline, and I was lucky. It
would break if other BPF entrypoints that are being called from
filter.o would appear. I'll wait for more comments, and respin a v5
after the weekend.


Thanks,
Björn
Björn Töpel Dec. 13, 2019, 8:23 a.m. UTC | #4
On Wed, 11 Dec 2019 at 14:26, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> [...]
> > +/* The BPF dispatcher is a multiway branch code generator. The
> > + * dispatcher is a mechanism to avoid the performance penalty of an
> > + * indirect call, which is expensive when retpolines are enabled. A
> > + * dispatch client registers a BPF program into the dispatcher, and if
> > + * there is available room in the dispatcher a direct call to the BPF
> > + * program will be generated. All calls to the BPF programs called via
> > + * the dispatcher will then be a direct call, instead of an
> > + * indirect. The dispatcher hijacks a trampoline function it via the
> > + * __fentry__ of the trampoline. The trampoline function has the
> > + * following signature:
> > + *
> > + * unsigned int trampoline(const void *xdp_ctx,
> > + *                         const struct bpf_insn *insnsi,
> > + *                         unsigned int (*bpf_func)(const void *,
> > + *                                                  const struct bpf_insn *));
> > + */
>
> Nit: s/xdp_ctx/ctx/
>

Thanks! Same typo in the DEFINE/DECLARE macros. Will fix in v5.


Björn


> -Toke
>
Alexei Starovoitov Dec. 13, 2019, 3:04 p.m. UTC | #5
On Fri, Dec 13, 2019 at 08:51:47AM +0100, Björn Töpel wrote:
> 
> > I hope my guess that compiler didn't inline it is correct. Then extra noinline
> > will not hurt and that's the only thing needed to avoid the issue.
> >
> 
> I'd say it's broken not marking it as noinline, and I was lucky. It
> would break if other BPF entrypoints that are being called from
> filter.o would appear. I'll wait for more comments, and respin a v5
> after the weekend.

Also noticed that EXPORT_SYMBOL for dispatch function is not necessary atm.
Please drop it. It can be added later when need arises.

With that please respin right away. No need to wait till Monday.
My general approach on accepting patches is "perfect is the enemy of the good".
It's better to land patches sooner if architecture and api looks good.
Details and minor bugs can be worked out step by step.
Björn Töpel Dec. 13, 2019, 3:49 p.m. UTC | #6
On Fri, 13 Dec 2019 at 16:04, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Dec 13, 2019 at 08:51:47AM +0100, Björn Töpel wrote:
> >
> > > I hope my guess that compiler didn't inline it is correct. Then extra noinline
> > > will not hurt and that's the only thing needed to avoid the issue.
> > >
> >
> > I'd say it's broken not marking it as noinline, and I was lucky. It
> > would break if other BPF entrypoints that are being called from
> > filter.o would appear. I'll wait for more comments, and respin a v5
> > after the weekend.
>
> Also noticed that EXPORT_SYMBOL for dispatch function is not necessary atm.
> Please drop it. It can be added later when need arises.
>

It's needed for module builds, so I cannot drop it!

> With that please respin right away. No need to wait till Monday.
> My general approach on accepting patches is "perfect is the enemy of the good".
> It's better to land patches sooner if architecture and api looks good.
> Details and minor bugs can be worked out step by step.
>

Ok! Will respin right away!
Alexei Starovoitov Dec. 13, 2019, 3:51 p.m. UTC | #7
On Fri, Dec 13, 2019 at 7:49 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> On Fri, 13 Dec 2019 at 16:04, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Dec 13, 2019 at 08:51:47AM +0100, Björn Töpel wrote:
> > >
> > > > I hope my guess that compiler didn't inline it is correct. Then extra noinline
> > > > will not hurt and that's the only thing needed to avoid the issue.
> > > >
> > >
> > > I'd say it's broken not marking it as noinline, and I was lucky. It
> > > would break if other BPF entrypoints that are being called from
> > > filter.o would appear. I'll wait for more comments, and respin a v5
> > > after the weekend.
> >
> > Also noticed that EXPORT_SYMBOL for dispatch function is not necessary atm.
> > Please drop it. It can be added later when need arises.
> >
>
> It's needed for module builds, so I cannot drop it!

Not following. Which module it's used out of?
Björn Töpel Dec. 13, 2019, 3:59 p.m. UTC | #8
On Fri, 13 Dec 2019 at 16:52, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Dec 13, 2019 at 7:49 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
> >
> > On Fri, 13 Dec 2019 at 16:04, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Fri, Dec 13, 2019 at 08:51:47AM +0100, Björn Töpel wrote:
> > > >
> > > > > I hope my guess that compiler didn't inline it is correct. Then extra noinline
> > > > > will not hurt and that's the only thing needed to avoid the issue.
> > > > >
> > > >
> > > > I'd say it's broken not marking it as noinline, and I was lucky. It
> > > > would break if other BPF entrypoints that are being called from
> > > > filter.o would appear. I'll wait for more comments, and respin a v5
> > > > after the weekend.
> > >
> > > Also noticed that EXPORT_SYMBOL for dispatch function is not necessary atm.
> > > Please drop it. It can be added later when need arises.
> > >
> >
> > It's needed for module builds, so I cannot drop it!
>
> Not following. Which module it's used out of?

The trampoline is referenced from bpf_prog_run_xdp(), which is
inlined. Without EXPORT, the symbol is not visible for the module. So,
if, say i40e, is built as a module, you'll get a linker error.
Alexei Starovoitov Dec. 13, 2019, 4:03 p.m. UTC | #9
On Fri, Dec 13, 2019 at 7:59 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> On Fri, 13 Dec 2019 at 16:52, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Dec 13, 2019 at 7:49 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
> > >
> > > On Fri, 13 Dec 2019 at 16:04, Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Fri, Dec 13, 2019 at 08:51:47AM +0100, Björn Töpel wrote:
> > > > >
> > > > > > I hope my guess that compiler didn't inline it is correct. Then extra noinline
> > > > > > will not hurt and that's the only thing needed to avoid the issue.
> > > > > >
> > > > >
> > > > > I'd say it's broken not marking it as noinline, and I was lucky. It
> > > > > would break if other BPF entrypoints that are being called from
> > > > > filter.o would appear. I'll wait for more comments, and respin a v5
> > > > > after the weekend.
> > > >
> > > > Also noticed that EXPORT_SYMBOL for dispatch function is not necessary atm.
> > > > Please drop it. It can be added later when need arises.
> > > >
> > >
> > > It's needed for module builds, so I cannot drop it!
> >
> > Not following. Which module it's used out of?
>
> The trampoline is referenced from bpf_prog_run_xdp(), which is
> inlined. Without EXPORT, the symbol is not visible for the module. So,
> if, say i40e, is built as a module, you'll get a linker error.

I'm still not following. i40e is not using this dispatch logic any more.
Björn Töpel Dec. 13, 2019, 4:09 p.m. UTC | #10
On Fri, 13 Dec 2019 at 17:03, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Dec 13, 2019 at 7:59 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
> >
> > On Fri, 13 Dec 2019 at 16:52, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Fri, Dec 13, 2019 at 7:49 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
> > > >
> > > > On Fri, 13 Dec 2019 at 16:04, Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Fri, Dec 13, 2019 at 08:51:47AM +0100, Björn Töpel wrote:
> > > > > >
> > > > > > > I hope my guess that compiler didn't inline it is correct. Then extra noinline
> > > > > > > will not hurt and that's the only thing needed to avoid the issue.
> > > > > > >
> > > > > >
> > > > > > I'd say it's broken not marking it as noinline, and I was lucky. It
> > > > > > would break if other BPF entrypoints that are being called from
> > > > > > filter.o would appear. I'll wait for more comments, and respin a v5
> > > > > > after the weekend.
> > > > >
> > > > > Also noticed that EXPORT_SYMBOL for dispatch function is not necessary atm.
> > > > > Please drop it. It can be added later when need arises.
> > > > >
> > > >
> > > > It's needed for module builds, so I cannot drop it!
> > >
> > > Not following. Which module it's used out of?
> >
> > The trampoline is referenced from bpf_prog_run_xdp(), which is
> > inlined. Without EXPORT, the symbol is not visible for the module. So,
> > if, say i40e, is built as a module, you'll get a linker error.
>
> I'm still not following. i40e is not using this dispatch logic any more.

Hmm, *all* XDP users use it, but indirectly via bpf_prog_run_xdp().
All drivers that execute an XDP program do so via
bpf_prog_run_xdp(), e.g. i40e_txrx.c and i40e_xsk.c.
bpf_prog_run_xdp() is inlined and expanded to __BPF_PROG_RUN(), which
calls into the dispatcher trampoline.

$ nm drivers/net/ethernet/intel/i40e/i40e_xsk.o|grep xdpfunc
                 U bpf_dispatcher_xdpfunc

Makes sense?
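
The symbol-visibility argument can be sketched in a single C file, with comments marking where each part would live in a real tree. All names (sym_fn, sym_dispatcher_func, sym_run_xdp, sym_driver_rx, sym_prog) are hypothetical. The key assumption being illustrated is that a static inline function in a header is compiled into every object that includes it, so each such object ends up with an undefined reference to whatever the inline body calls.

```c
#include <assert.h>
#include <stddef.h>

/* --- would live in a shared header --- */
typedef unsigned int (*sym_fn)(const void *ctx);

/* Declared in the header, defined in exactly one object file. */
unsigned int sym_dispatcher_func(const void *ctx, sym_fn fn);

/*
 * static inline: every including object gets its own copy of this
 * body, and with it a reference to sym_dispatcher_func that the
 * linker (or the module loader) has to resolve.
 */
static inline unsigned int sym_run_xdp(const void *ctx, sym_fn fn)
{
	return sym_dispatcher_func(ctx, fn);
}

/* --- would live in the core object (filter.c in the patch) --- */
unsigned int sym_dispatcher_func(const void *ctx, sym_fn fn)
{
	assert(fn != NULL);
	return fn(ctx);
}

/* --- would live in a driver, possibly built as a module --- */
static unsigned int sym_prog(const void *ctx) { (void)ctx; return 3; }

unsigned int sym_driver_rx(const void *ctx)
{
	return sym_run_xdp(ctx, sym_prog);
}
```

In the kernel, a module can only resolve such a reference if the defining object exports the symbol, which is why the nm output above shows bpf_dispatcher_xdpfunc as undefined (U) in i40e_xsk.o.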
Alexei Starovoitov Dec. 13, 2019, 5:18 p.m. UTC | #11
On Fri, Dec 13, 2019 at 05:09:46PM +0100, Björn Töpel wrote:
> On Fri, 13 Dec 2019 at 17:03, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Dec 13, 2019 at 7:59 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
> > >
> > > On Fri, 13 Dec 2019 at 16:52, Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Fri, Dec 13, 2019 at 7:49 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
> > > > >
> > > > > On Fri, 13 Dec 2019 at 16:04, Alexei Starovoitov
> > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Dec 13, 2019 at 08:51:47AM +0100, Björn Töpel wrote:
> > > > > > >
> > > > > > > > I hope my guess that compiler didn't inline it is correct. Then extra noinline
> > > > > > > > will not hurt and that's the only thing needed to avoid the issue.
> > > > > > > >
> > > > > > >
> > > > > > > I'd say it's broken not marking it as noinline, and I was lucky. It
> > > > > > > would break if other BPF entrypoints that are being called from
> > > > > > > filter.o would appear. I'll wait for more comments, and respin a v5
> > > > > > > after the weekend.
> > > > > >
> > > > > > Also noticed that EXPORT_SYMBOL for dispatch function is not necessary atm.
> > > > > > Please drop it. It can be added later when need arises.
> > > > > >
> > > > >
> > > > > It's needed for module builds, so I cannot drop it!
> > > >
> > > > Not following. Which module it's used out of?
> > >
> > > The trampoline is referenced from bpf_prog_run_xdp(), which is
> > > inlined. Without EXPORT, the symbol is not visible for the module. So,
> > > if, say i40e, is built as a module, you'll get a linker error.
> >
> > I'm still not following. i40e is not using this dispatch logic any more.
> 
> Hmm, *all* XDP users use it, but indirectly via bpf_prog_run_xdp().
> All drivers that execute an XDP program do so via
> bpf_prog_run_xdp(), e.g. i40e_txrx.c and i40e_xsk.c.
> bpf_prog_run_xdp() is inlined and expanded to __BPF_PROG_RUN(), which
> calls into the dispatcher trampoline.
> 
> $ nm drivers/net/ethernet/intel/i40e/i40e_xsk.o|grep xdpfunc
>                  U bpf_dispatcher_xdpfunc
> 
> Makes sense?

ahh. got it. bpf_prog_run_xdp() is static inline in .h

Thank you for explaining!
Patch

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index b8be18427277..3ce7ad41bd6f 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -10,10 +10,12 @@ 
 #include <linux/if_vlan.h>
 #include <linux/bpf.h>
 #include <linux/memory.h>
+#include <linux/sort.h>
 #include <asm/extable.h>
 #include <asm/set_memory.h>
 #include <asm/nospec-branch.h>
 #include <asm/text-patching.h>
+#include <asm/asm-prototypes.h>
 
 static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
 {
@@ -1530,6 +1532,126 @@  int arch_prepare_bpf_trampoline(void *image, struct btf_func_model *m, u32 flags
 	return 0;
 }
 
+static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	s64 offset;
+
+	offset = func - (ip + 2 + 4);
+	if (!is_simm32(offset)) {
+		pr_err("Target %p is out of range\n", func);
+		return -EINVAL;
+	}
+	EMIT2_off32(0x0F, jmp_cond + 0x10, offset);
+	*pprog = prog;
+	return 0;
+}
+
+static int emit_fallback_jump(u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int err = 0;
+
+#ifdef CONFIG_RETPOLINE
+	/* Note that this assumes that the compiler uses external
+	 * thunks for indirect calls. Both clang and GCC use the same
+	 * naming convention for external thunks.
+	 */
+	err = emit_jump(&prog, __x86_indirect_thunk_rdx, prog);
+#else
+	int cnt = 0;
+
+	EMIT2(0xFF, 0xE2);	/* jmp rdx */
+#endif
+	*pprog = prog;
+	return err;
+}
+
+static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs)
+{
+	int pivot, err, jg_bytes = 1, cnt = 0;
+	u8 *jg_reloc, *prog = *pprog;
+	s64 jg_offset;
+
+	if (a == b) {
+		/* Leaf node of recursion, i.e. not a range of indices
+		 * anymore.
+		 */
+		EMIT1(add_1mod(0x48, BPF_REG_3));	/* cmp rdx,func */
+		if (!is_simm32(progs[a]))
+			return -1;
+		EMIT2_off32(0x81, add_1reg(0xF8, BPF_REG_3),
+			    progs[a]);
+		err = emit_cond_near_jump(&prog,	/* je func */
+					  (void *)progs[a], prog,
+					  X86_JE);
+		if (err)
+			return err;
+
+		err = emit_fallback_jump(&prog);	/* jmp thunk/indirect */
+		if (err)
+			return err;
+
+		*pprog = prog;
+		return 0;
+	}
+
+	/* Not a leaf node, so we pivot, and recursively descend into
+	 * the lower and upper ranges.
+	 */
+	pivot = (b - a) / 2;
+	EMIT1(add_1mod(0x48, BPF_REG_3));		/* cmp rdx,func */
+	if (!is_simm32(progs[a + pivot]))
+		return -1;
+	EMIT2_off32(0x81, add_1reg(0xF8, BPF_REG_3), progs[a + pivot]);
+
+	if (pivot > 2) {				/* jg upper_part */
+		/* Require near jump. */
+		jg_bytes = 4;
+		EMIT2_off32(0x0F, X86_JG + 0x10, 0);
+	} else {
+		EMIT2(X86_JG, 0);
+	}
+	jg_reloc = prog;
+
+	err = emit_bpf_dispatcher(&prog, a, a + pivot,	/* emit lower_part */
+				  progs);
+	if (err)
+		return err;
+
+	jg_offset = prog - jg_reloc;
+	emit_code(jg_reloc - jg_bytes, jg_offset, jg_bytes);
+
+	err = emit_bpf_dispatcher(&prog, a + pivot + 1,	/* emit upper_part */
+				  b, progs);
+	if (err)
+		return err;
+
+	*pprog = prog;
+	return 0;
+}
+
+static int cmp_ips(const void *a, const void *b)
+{
+	const s64 *ipa = a;
+	const s64 *ipb = b;
+
+	if (*ipa > *ipb)
+		return 1;
+	if (*ipa < *ipb)
+		return -1;
+	return 0;
+}
+
+int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs)
+{
+	u8 *prog = image;
+
+	sort(funcs, num_funcs, sizeof(funcs[0]), cmp_ips, NULL);
+	return emit_bpf_dispatcher(&prog, 0, num_funcs - 1, funcs);
+}
+
 struct x64_jit_data {
 	struct bpf_binary_header *header;
 	int *addrs;
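
The displacement arithmetic in emit_cond_near_jump() above can be illustrated with a small user-space sketch. The helper names (fits_simm32, encode_je_rel32) are hypothetical and this is not the kernel's emit machinery; it only shows how a 6-byte "je rel32" is formed, with the displacement measured from the end of the instruction (2 opcode bytes plus 4 displacement bytes), and how an out-of-range target is rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if v fits in a signed 32-bit immediate. */
static int fits_simm32(int64_t v)
{
	return v >= INT32_MIN && v <= INT32_MAX;
}

/*
 * Encode "je target" as a near jump (0F 84 rel32) located at address ip.
 * Returns 0 on success, -1 if the target is out of rel32 range.
 */
static int encode_je_rel32(uint8_t out[6], uint64_t ip, uint64_t target)
{
	int64_t offset = (int64_t)(target - (ip + 2 + 4));
	uint32_t rel;
	int i;

	assert(out != NULL);
	if (!fits_simm32(offset))
		return -1;

	rel = (uint32_t)(int32_t)offset;
	out[0] = 0x0F;
	out[1] = 0x84;	/* JE near; the patch forms this as X86_JE + 0x10 */
	for (i = 0; i < 4; i++)	/* displacement is stored little-endian */
		out[2 + i] = (uint8_t)(rel >> (8 * i));
	return 0;
}
```

For example, a jump at 0x1000 to 0x1010 gets displacement 0x0A, since the instruction itself occupies 6 bytes.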
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5d744828b399..e6a9d74d4e30 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -470,12 +470,61 @@  struct bpf_trampoline {
 	void *image;
 	u64 selector;
 };
+
+#define BPF_DISPATCHER_MAX 64 /* Fits in 2048B */
+
+struct bpf_dispatcher_prog {
+	struct bpf_prog *prog;
+	refcount_t users;
+};
+
+struct bpf_dispatcher {
+	/* dispatcher mutex */
+	struct mutex mutex;
+	void *func;
+	struct bpf_dispatcher_prog progs[BPF_DISPATCHER_MAX];
+	int num_progs;
+	void *image;
+	u32 image_off;
+};
+
 #ifdef CONFIG_BPF_JIT
 struct bpf_trampoline *bpf_trampoline_lookup(u64 key);
 int bpf_trampoline_link_prog(struct bpf_prog *prog);
 int bpf_trampoline_unlink_prog(struct bpf_prog *prog);
 void bpf_trampoline_put(struct bpf_trampoline *tr);
 void *bpf_jit_alloc_exec_page(void);
+#define BPF_DISPATCHER_INIT(name) {			\
+	.mutex = __MUTEX_INITIALIZER(name.mutex),	\
+	.func = &name##func,				\
+	.progs = {},					\
+	.num_progs = 0,					\
+	.image = NULL,					\
+	.image_off = 0					\
+}
+
+#define DEFINE_BPF_DISPATCHER(name)					\
+	unsigned int name##func(					\
+		const void *xdp_ctx,					\
+		const struct bpf_insn *insnsi,				\
+		unsigned int (*bpf_func)(const void *,			\
+					 const struct bpf_insn *))	\
+	{								\
+		return bpf_func(xdp_ctx, insnsi);			\
+	}								\
+	EXPORT_SYMBOL(name##func);			\
+	struct bpf_dispatcher name = BPF_DISPATCHER_INIT(name);
+#define DECLARE_BPF_DISPATCHER(name)					\
+	unsigned int name##func(					\
+		const void *xdp_ctx,					\
+		const struct bpf_insn *insnsi,				\
+		unsigned int (*bpf_func)(const void *,			\
+					 const struct bpf_insn *));	\
+	extern struct bpf_dispatcher name;
+#define BPF_DISPATCHER_FUNC(name) name##func
+#define BPF_DISPATCHER_PTR(name) (&name)
+void bpf_dispatcher_change_prog(struct bpf_dispatcher *d, struct bpf_prog *from,
+				struct bpf_prog *to);
 #else
 static inline struct bpf_trampoline *bpf_trampoline_lookup(u64 key)
 {
@@ -490,6 +539,13 @@  static inline int bpf_trampoline_unlink_prog(struct bpf_prog *prog)
 	return -ENOTSUPP;
 }
 static inline void bpf_trampoline_put(struct bpf_trampoline *tr) {}
+#define DEFINE_BPF_DISPATCHER(name)
+#define DECLARE_BPF_DISPATCHER(name)
+#define BPF_DISPATCHER_FUNC(name) bpf_dispatcher_nopfunc
+#define BPF_DISPATCHER_PTR(name) NULL
+static inline void bpf_dispatcher_change_prog(struct bpf_dispatcher *d,
+					      struct bpf_prog *from,
+					      struct bpf_prog *to) {}
 #endif
 
 struct bpf_func_info_aux {
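
The progs[] array declared in this header is used by the dispatcher core (kernel/bpf/dispatcher.c in this patch) as a small refcounted slot table. A simplified user-space model is sketched below. The names (slot_entry, slot_table, slot_find, slot_add, slot_remove, SLOT_MAX) are hypothetical, plain counters stand in for refcount_t, and the bpf_prog_inc()/bpf_prog_put() calls are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOT_MAX 4	/* the patch uses BPF_DISPATCHER_MAX = 64 */

struct slot_entry { const void *prog; unsigned int users; };
struct slot_table { struct slot_entry slots[SLOT_MAX]; int num; };

/* Linear scan; looking up NULL finds a free slot. */
static struct slot_entry *slot_find(struct slot_table *t, const void *prog)
{
	for (int i = 0; i < SLOT_MAX; i++)
		if (t->slots[i].prog == prog)
			return &t->slots[i];
	return NULL;
}

/* Returns true only when the set of distinct programs changed. */
static bool slot_add(struct slot_table *t, const void *prog)
{
	struct slot_entry *e;

	if (!prog)
		return false;
	e = slot_find(t, prog);
	if (e) {			/* already present: one more user */
		e->users++;
		return false;
	}
	e = slot_find(t, NULL);
	if (!e)				/* table full */
		return false;
	e->prog = prog;
	e->users = 1;
	t->num++;
	return true;
}

/* Returns true only when the last user of a program went away. */
static bool slot_remove(struct slot_table *t, const void *prog)
{
	struct slot_entry *e;

	if (!prog)
		return false;
	e = slot_find(t, prog);
	if (!e)
		return false;
	assert(e->users > 0);
	if (--e->users == 0) {
		e->prog = NULL;
		t->num--;
		return true;
	}
	return false;
}
```

The boolean results mirror how the patch decides whether the dispatcher image needs to be regenerated: only a change in the set of distinct programs triggers an update.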
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 3f671bf617e8..d4f330351f87 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -8,6 +8,7 @@  obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o
 obj-$(CONFIG_BPF_JIT) += trampoline.o
 obj-$(CONFIG_BPF_SYSCALL) += btf.o
+obj-$(CONFIG_BPF_JIT) += dispatcher.o
 ifeq ($(CONFIG_NET),y)
 obj-$(CONFIG_BPF_SYSCALL) += devmap.o
 obj-$(CONFIG_BPF_SYSCALL) += cpumap.o
diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c
new file mode 100644
index 000000000000..a4690460d815
--- /dev/null
+++ b/kernel/bpf/dispatcher.c
@@ -0,0 +1,159 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2019 Intel Corporation. */
+
+#include <linux/hash.h>
+#include <linux/bpf.h>
+#include <linux/filter.h>
+
+/* The BPF dispatcher is a multiway branch code generator. The
+ * dispatcher is a mechanism to avoid the performance penalty of an
+ * indirect call, which is expensive when retpolines are enabled. A
+ * dispatch client registers a BPF program into the dispatcher, and if
+ * there is available room in the dispatcher a direct call to the BPF
+ * program will be generated. All calls to the BPF programs called via
+ * the dispatcher will then be a direct call, instead of an
+ * indirect. The dispatcher hijacks a trampoline function via the
+ * __fentry__ of the trampoline. The trampoline function has the
+ * following signature:
+ *
+ * unsigned int trampoline(const void *xdp_ctx,
+ *                         const struct bpf_insn *insnsi,
+ *                         unsigned int (*bpf_func)(const void *,
+ *                                                  const struct bpf_insn *));
+ */
+
+static struct bpf_dispatcher_prog *bpf_dispatcher_find_prog(
+	struct bpf_dispatcher *d, struct bpf_prog *prog)
+{
+	int i;
+
+	for (i = 0; i < BPF_DISPATCHER_MAX; i++) {
+		if (prog == d->progs[i].prog)
+			return &d->progs[i];
+	}
+	return NULL;
+}
+
+static struct bpf_dispatcher_prog *bpf_dispatcher_find_free(
+	struct bpf_dispatcher *d)
+{
+	return bpf_dispatcher_find_prog(d, NULL);
+}
+
+static bool bpf_dispatcher_add_prog(struct bpf_dispatcher *d,
+				    struct bpf_prog *prog)
+{
+	struct bpf_dispatcher_prog *entry;
+
+	if (!prog)
+		return false;
+
+	entry = bpf_dispatcher_find_prog(d, prog);
+	if (entry) {
+		refcount_inc(&entry->users);
+		return false;
+	}
+
+	entry = bpf_dispatcher_find_free(d);
+	if (!entry)
+		return false;
+
+	bpf_prog_inc(prog);
+	entry->prog = prog;
+	refcount_set(&entry->users, 1);
+	d->num_progs++;
+	return true;
+}
+
+static bool bpf_dispatcher_remove_prog(struct bpf_dispatcher *d,
+				       struct bpf_prog *prog)
+{
+	struct bpf_dispatcher_prog *entry;
+
+	if (!prog)
+		return false;
+
+	entry = bpf_dispatcher_find_prog(d, prog);
+	if (!entry)
+		return false;
+
+	if (refcount_dec_and_test(&entry->users)) {
+		entry->prog = NULL;
+		bpf_prog_put(prog);
+		d->num_progs--;
+		return true;
+	}
+	return false;
+}
+
+int __weak arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs)
+{
+	return -ENOTSUPP;
+}
+
+static int bpf_dispatcher_prepare(struct bpf_dispatcher *d, void *image)
+{
+	s64 ips[BPF_DISPATCHER_MAX] = {}, *ipsp = &ips[0];
+	int i;
+
+	for (i = 0; i < BPF_DISPATCHER_MAX; i++) {
+		if (d->progs[i].prog)
+			*ipsp++ = (s64)(uintptr_t)d->progs[i].prog->bpf_func;
+	}
+	return arch_prepare_bpf_dispatcher(image, &ips[0], d->num_progs);
+}
+
+static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs)
+{
+	void *old, *new;
+	u32 noff;
+	int err;
+
+	if (!prev_num_progs) {
+		old = NULL;
+		noff = 0;
+	} else {
+		old = d->image + d->image_off;
+		noff = d->image_off ^ (PAGE_SIZE / 2);
+	}
+
+	new = d->num_progs ? d->image + noff : NULL;
+	if (new) {
+		if (bpf_dispatcher_prepare(d, new))
+			return;
+	}
+
+	err = bpf_arch_text_poke(d->func, BPF_MOD_JUMP, old, new);
+	if (err || !new)
+		return;
+
+	d->image_off = noff;
+}
+
+void bpf_dispatcher_change_prog(struct bpf_dispatcher *d, struct bpf_prog *from,
+				struct bpf_prog *to)
+{
+	bool changed = false;
+	int prev_num_progs;
+
+	if (from == to)
+		return;
+
+	mutex_lock(&d->mutex);
+	if (!d->image) {
+		d->image = bpf_jit_alloc_exec_page();
+		if (!d->image)
+			goto out;
+	}
+
+	prev_num_progs = d->num_progs;
+	changed |= bpf_dispatcher_remove_prog(d, from);
+	changed |= bpf_dispatcher_add_prog(d, to);
+
+	if (!changed)
+		goto out;
+
+	bpf_dispatcher_update(d, prev_num_progs);
+out:
+	mutex_unlock(&d->mutex);
+}
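
The offset handling in bpf_dispatcher_update() amounts to double-buffering within one page: new code is generated into the half that is not currently live, and the jump is then re-pointed at it. A minimal sketch of just the offset selection follows, assuming 4 KiB pages. The names (HALF_PAGE_SIZE, half_next_off) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HALF_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * Given the offset of the currently live half (0 or PAGE_SIZE / 2) and
 * the number of programs before the change, return the offset at which
 * the next dispatcher image should be generated. With no previous
 * programs nothing is live, so the first half is used.
 */
static uint32_t half_next_off(uint32_t cur_off, int prev_num_progs)
{
	assert(cur_off == 0 || cur_off == HALF_PAGE_SIZE / 2);
	if (!prev_num_progs)
		return 0;
	return cur_off ^ (HALF_PAGE_SIZE / 2);
}
```

Because the old half stays untouched until the poke succeeds, a failed update leaves the previously generated code in place, which matches the patch only storing the new offset after bpf_arch_text_poke() returns without error.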