[v3,bpf] bpf: introduce BPF_JIT_ALWAYS_ON config

Message ID 20180109180429.1115005-1-ast@kernel.org
State Accepted, archived
Delegated to: BPF Maintainers

Commit Message

Alexei Starovoitov Jan. 9, 2018, 6:04 p.m. UTC
The BPF interpreter has been used as part of the Spectre variant 2 attack (CVE-2017-5715).

A quote from the Google Project Zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."

To make the attacker's job harder, introduce the BPF_JIT_ALWAYS_ON config
option, which removes the interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64

The start of a JITed program is randomized and its code page is marked read-only.
In addition, "constant blinding" can be turned on with net.core.bpf_jit_harden.
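
As a rough illustration of the sysctl knobs mentioned above, the following
user-space sketch reads net.core.bpf_jit_enable and net.core.bpf_jit_harden
through procfs. The helper names (parse_sysctl_int, read_sysctl_int,
report_bpf_jit_sysctls) are made up for this example and are not part of the
patch; the /proc/sys paths are assumed to be the usual procfs view of those
sysctls, and reading them may require privileges depending on the file mode.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse sysctl file text such as "1\n" into an int.
 * Returns 0 on success, -1 if the text is not a plain integer. */
int parse_sysctl_int(const char *text, int *out)
{
	char *end;
	long v;

	assert(text != NULL && out != NULL);

	errno = 0;
	v = strtol(text, &end, 10);
	if (end == text || errno != 0 || v < INT_MIN || v > INT_MAX)
		return -1;

	/* Tolerate trailing whitespace, e.g. the newline procfs appends. */
	while (*end == '\n' || *end == ' ' || *end == '\t')
		end++;
	if (*end != '\0')
		return -1;

	*out = (int)v;
	return 0;
}

/* Read one integer-valued sysctl file; 0 on success, -1 on any failure. */
int read_sysctl_int(const char *path, int *out)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_sysctl_int(buf, out);
}

/* Hypothetical helper: print the current JIT-related settings. */
void report_bpf_jit_sysctls(void)
{
	static const char *const paths[] = {
		"/proc/sys/net/core/bpf_jit_enable",
		"/proc/sys/net/core/bpf_jit_harden",
	};
	size_t i;

	for (i = 0; i < sizeof(paths) / sizeof(paths[0]); i++) {
		int value;

		if (read_sysctl_int(paths[i], &value) == 0)
			printf("%s = %d\n", paths[i], value);
		else
			printf("%s: not readable\n", paths[i]);
	}
}
```

Calling report_bpf_jit_sysctls() from a small main() is enough to check how a
given machine is configured; on a kernel built with BPF_JIT_ALWAYS_ON one would
expect bpf_jit_enable to read back as 1.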

v2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)

v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
  It will be sent when the trees are merged back to net-next

Considered doing:
  int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 init/Kconfig               |  7 +++++++
 kernel/bpf/core.c          | 19 +++++++++++++++++++
 lib/test_bpf.c             | 11 +++++++----
 net/core/filter.c          |  6 ++----
 net/core/sysctl_net_core.c |  6 ++++++
 net/socket.c               |  9 +++++++++
 6 files changed, 50 insertions(+), 8 deletions(-)
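
The control flow that the kernel/bpf/core.c hunk gives
bpf_prog_select_runtime() can be modeled in user space roughly as follows.
This is only a sketch: struct model_prog, the model_* functions and
MODEL_ENOTSUPP are hypothetical stand-ins, the JIT outcome is passed in as a
flag instead of calling a real JIT, and the device-bound (offload) branch is
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel-internal ENOTSUPP code; the value is an
 * assumption made for this model. */
#define MODEL_ENOTSUPP 524

/* Hypothetical, heavily reduced stand-in for struct bpf_prog. */
struct model_prog {
	bool jited;
	unsigned int (*run)(const void *ctx);
};

/* Mirrors the role of __bpf_prog_ret0: a dummy that always returns 0. */
static unsigned int model_prog_ret0(const void *ctx)
{
	(void)ctx;
	return 0;
}

/* Placeholder for the interpreter entry point (compiled out by the option). */
static unsigned int model_interpreter(const void *ctx)
{
	(void)ctx;
	return 2;
}

/* Placeholder for JIT-generated native code. */
static unsigned int model_jited_body(const void *ctx)
{
	(void)ctx;
	return 1;
}

/* jit_always_on models CONFIG_BPF_JIT_ALWAYS_ON; jit_succeeds models
 * whether the arch JIT managed to compile the program. */
int model_select_runtime(struct model_prog *p, bool jit_always_on,
			 bool jit_succeeds)
{
	assert(p != NULL);

	/* Default entry point: interpreter, or the ret0 dummy if it is gone. */
	p->run = jit_always_on ? model_prog_ret0 : model_interpreter;
	p->jited = false;

	if (jit_succeeds) {
		p->run = model_jited_body;
		p->jited = true;
	}

	/* With the interpreter removed there is nothing to fall back to. */
	if (jit_always_on && !p->jited)
		return -MODEL_ENOTSUPP;

	return 0;
}
```

The point of the dummy is that even a program whose load fails never carries
an entry point into interpreter code; callers are expected to check the error
and drop the program.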

Comments

Daniel Borkmann Jan. 9, 2018, 9:39 p.m. UTC | #1
On 01/09/2018 07:04 PM, Alexei Starovoitov wrote:
> [...]

Applied to bpf tree, thanks Alexei!
David Woodhouse Jan. 24, 2018, 10:07 a.m. UTC | #2
On Tue, 2018-01-09 at 22:39 +0100, Daniel Borkmann wrote:
> On 01/09/2018 07:04 PM, Alexei Starovoitov wrote:
> > [...]
>
> Applied to bpf tree, thanks Alexei!

For stable too?
Daniel Borkmann Jan. 24, 2018, 10:10 a.m. UTC | #3
On 01/24/2018 11:07 AM, David Woodhouse wrote:
> On Tue, 2018-01-09 at 22:39 +0100, Daniel Borkmann wrote:
>> On 01/09/2018 07:04 PM, Alexei Starovoitov wrote:
>>> [...]
>>
>> Applied to bpf tree, thanks Alexei!
> 
> For stable too?

Yes, this will go into stable as well; batch of backports will come Thurs/Fri.
Greg Kroah-Hartman Jan. 28, 2018, 2:45 p.m. UTC | #4
On Wed, Jan 24, 2018 at 11:10:50AM +0100, Daniel Borkmann wrote:
> On 01/24/2018 11:07 AM, David Woodhouse wrote:
> > On Tue, 2018-01-09 at 22:39 +0100, Daniel Borkmann wrote:
> >> On 01/09/2018 07:04 PM, Alexei Starovoitov wrote:
> >>> [...]
> >>
> >> Applied to bpf tree, thanks Alexei!
> > 
> > For stable too?
> 
> Yes, this will go into stable as well; batch of backports will come Thurs/Fri.

Any word on these?  Worst case, a simple list of git commit ids to
backport is all I need.

thanks,

greg k-h
Daniel Borkmann Jan. 28, 2018, 11:40 p.m. UTC | #5
On 01/28/2018 03:45 PM, Greg KH wrote:
> On Wed, Jan 24, 2018 at 11:10:50AM +0100, Daniel Borkmann wrote:
>> On 01/24/2018 11:07 AM, David Woodhouse wrote:
>>> On Tue, 2018-01-09 at 22:39 +0100, Daniel Borkmann wrote:
>>>> On 01/09/2018 07:04 PM, Alexei Starovoitov wrote:
>>>>> [...]
>>>>
>>>> Applied to bpf tree, thanks Alexei!
>>>
>>> For stable too?
>>
>> Yes, this will go into stable as well; batch of backports will come Thurs/Fri.
> 
> Any word on these?  Worse case, a simple list of git commit ids to
> backport is all I need.

Sorry for the delay! There are various conflicts all over the place, so I had
to backport manually. I just sent out the tested 4.14 batch; I'll try to get 4.9
out tonight as well, hopefully, and the rest for 4.4 on Mon.
Greg Kroah-Hartman Jan. 29, 2018, 12:31 p.m. UTC | #6
On Mon, Jan 29, 2018 at 12:40:47AM +0100, Daniel Borkmann wrote:
> On 01/28/2018 03:45 PM, Greg KH wrote:
> > On Wed, Jan 24, 2018 at 11:10:50AM +0100, Daniel Borkmann wrote:
> >> On 01/24/2018 11:07 AM, David Woodhouse wrote:
> >>> On Tue, 2018-01-09 at 22:39 +0100, Daniel Borkmann wrote:
> >>>> On 01/09/2018 07:04 PM, Alexei Starovoitov wrote:
> >>>>> [...]
> >>>>
> >>>> Applied to bpf tree, thanks Alexei!
> >>>
> >>> For stable too?
> >>
> >> Yes, this will go into stable as well; batch of backports will come Thurs/Fri.
> > 
> > Any word on these?  Worse case, a simple list of git commit ids to
> > backport is all I need.
> 
> Sorry for the delay! There are various conflicts all over the place, so I had
> to backport manually. I just flushed out tested 4.14 batch, I'll see to get 4.9
> out hopefully tonight as well, and the rest for 4.4 on Mon.

Not a problem at all; I just wanted to make sure I hadn't missed them
being posted somewhere :)

If you need/want help for the 4.4 stuff, just let me know, and I'll be
glad to work on it.

thanks,

greg k-h
Daniel Borkmann Jan. 29, 2018, 3:36 p.m. UTC | #7
On 01/29/2018 12:40 AM, Daniel Borkmann wrote:
> On 01/28/2018 03:45 PM, Greg KH wrote:
>> On Wed, Jan 24, 2018 at 11:10:50AM +0100, Daniel Borkmann wrote:
>>> On 01/24/2018 11:07 AM, David Woodhouse wrote:
>>>> On Tue, 2018-01-09 at 22:39 +0100, Daniel Borkmann wrote:
>>>>> On 01/09/2018 07:04 PM, Alexei Starovoitov wrote:
[...]
>>>>> Applied to bpf tree, thanks Alexei!
>>>>
>>>> For stable too?
>>>
>>> Yes, this will go into stable as well; batch of backports will come Thurs/Fri.
>>
>> Any word on these?  Worse case, a simple list of git commit ids to
>> backport is all I need.
> 
> Sorry for the delay! There are various conflicts all over the place, so I had
> to backport manually. I just flushed out tested 4.14 batch, I'll see to get 4.9
> out hopefully tonight as well, and the rest for 4.4 on Mon.

While the 4.14 and 4.9 BPF backports have been tested and out since yesterday,
and I saw Greg queued them up (thanks!), it looks like plain 4.4.113 doesn't
even boot on my machine. I can briefly see the kernel log, but my screen turns
black shortly thereafter and nothing responds anymore. I see no such problems
with the 4.9 and 4.14 stable kernels. (using x86_64, i7-6600U) Is this a known issue?

Thanks,
Daniel
Greg Kroah-Hartman Jan. 29, 2018, 5:36 p.m. UTC | #8
On Mon, Jan 29, 2018 at 04:36:35PM +0100, Daniel Borkmann wrote:
> On 01/29/2018 12:40 AM, Daniel Borkmann wrote:
> > On 01/28/2018 03:45 PM, Greg KH wrote:
> >> On Wed, Jan 24, 2018 at 11:10:50AM +0100, Daniel Borkmann wrote:
> >>> On 01/24/2018 11:07 AM, David Woodhouse wrote:
> >>>> On Tue, 2018-01-09 at 22:39 +0100, Daniel Borkmann wrote:
> >>>>> On 01/09/2018 07:04 PM, Alexei Starovoitov wrote:
> [...]
> >>>>> Applied to bpf tree, thanks Alexei!
> >>>>
> >>>> For stable too?
> >>>
> >>> Yes, this will go into stable as well; batch of backports will come Thurs/Fri.
> >>
> >> Any word on these?  Worse case, a simple list of git commit ids to
> >> backport is all I need.
> > 
> > Sorry for the delay! There are various conflicts all over the place, so I had
> > to backport manually. I just flushed out tested 4.14 batch, I'll see to get 4.9
> > out hopefully tonight as well, and the rest for 4.4 on Mon.
> 
> While 4.14 and 4.9 BPF backports are tested and out since yesterday, and I
> saw Greg queued them up (thanks!), it looks like plain 4.4.113 doesn't even
> boot on my machine. While I can shortly see the kernel log, my screen turns
> black shortly thereafter and nothing reacts anymore. No such problems with
> 4.9 and 4.14 stables seen. (using x86_64, i7-6600U) Is this a known issue?

Not that I know of, sorry.  Odd graphics issue perhaps?

If you have some test programs I can run, I can look into doing the
backports, I still have a laptop around here that runs 4.4 :)

There's always a virtual machine as well, have you tried that?

thanks,

greg k-h
Daniel Borkmann Jan. 29, 2018, 8:25 p.m. UTC | #9
On 01/29/2018 06:36 PM, Greg KH wrote:
> On Mon, Jan 29, 2018 at 04:36:35PM +0100, Daniel Borkmann wrote:
>> On 01/29/2018 12:40 AM, Daniel Borkmann wrote:
>>> On 01/28/2018 03:45 PM, Greg KH wrote:
>>>> On Wed, Jan 24, 2018 at 11:10:50AM +0100, Daniel Borkmann wrote:
>>>>> On 01/24/2018 11:07 AM, David Woodhouse wrote:
>>>>>> On Tue, 2018-01-09 at 22:39 +0100, Daniel Borkmann wrote:
>>>>>>> On 01/09/2018 07:04 PM, Alexei Starovoitov wrote:
>> [...]
>>>>>>> Applied to bpf tree, thanks Alexei!
>>>>>>
>>>>>> For stable too?
>>>>>
>>>>> Yes, this will go into stable as well; batch of backports will come Thurs/Fri.
>>>>
>>>> Any word on these?  Worse case, a simple list of git commit ids to
>>>> backport is all I need.
>>>
>>> Sorry for the delay! There are various conflicts all over the place, so I had
>>> to backport manually. I just flushed out tested 4.14 batch, I'll see to get 4.9
>>> out hopefully tonight as well, and the rest for 4.4 on Mon.
>>
>> While 4.14 and 4.9 BPF backports are tested and out since yesterday, and I
>> saw Greg queued them up (thanks!), it looks like plain 4.4.113 doesn't even
>> boot on my machine. While I can shortly see the kernel log, my screen turns
>> black shortly thereafter and nothing reacts anymore. No such problems with
>> 4.9 and 4.14 stables seen. (using x86_64, i7-6600U) Is this a known issue?
> 
> Not that I know of, sorry.  Odd graphics issue perhaps?
> 
> If you have some test programs I can run, I can look into doing the
> backports, I still have a laptop around here that runs 4.4 :)
> 
> There's always a virtual machine as well, have you tried that?

I've switched to an arm64 node now, that's working fine with 4.4, so patches
will come later tonight.

Thanks,
Daniel

Patch

diff --git a/init/Kconfig b/init/Kconfig
index 2934249fba46..5e2a4a391ba9 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1392,6 +1392,13 @@  config BPF_SYSCALL
 	  Enable the bpf() system call that allows to manipulate eBPF
 	  programs and maps via file descriptors.
 
+config BPF_JIT_ALWAYS_ON
+	bool "Permanently enable BPF JIT and remove BPF interpreter"
+	depends on BPF_SYSCALL && HAVE_EBPF_JIT && BPF_JIT
+	help
+	  Enables BPF JIT and removes BPF interpreter to avoid
+	  speculative execution of BPF instructions by the interpreter
+
 config USERFAULTFD
 	bool "Enable userfaultfd() system call"
 	select ANON_INODES
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 86b50aa26ee8..51ec2dda7f08 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -767,6 +767,7 @@  noinline u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
 }
 EXPORT_SYMBOL_GPL(__bpf_call_base);
 
+#ifndef CONFIG_BPF_JIT_ALWAYS_ON
 /**
  *	__bpf_prog_run - run eBPF program on a given context
  *	@ctx: is the data we are operating on
@@ -1317,6 +1318,14 @@  EVAL6(PROG_NAME_LIST, 224, 256, 288, 320, 352, 384)
 EVAL4(PROG_NAME_LIST, 416, 448, 480, 512)
 };
 
+#else
+static unsigned int __bpf_prog_ret0(const void *ctx,
+				    const struct bpf_insn *insn)
+{
+	return 0;
+}
+#endif
+
 bool bpf_prog_array_compatible(struct bpf_array *array,
 			       const struct bpf_prog *fp)
 {
@@ -1364,9 +1373,13 @@  static int bpf_check_tail_call(const struct bpf_prog *fp)
  */
 struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 {
+#ifndef CONFIG_BPF_JIT_ALWAYS_ON
 	u32 stack_depth = max_t(u32, fp->aux->stack_depth, 1);
 
 	fp->bpf_func = interpreters[(round_up(stack_depth, 32) / 32) - 1];
+#else
+	fp->bpf_func = __bpf_prog_ret0;
+#endif
 
 	/* eBPF JITs can rewrite the program in case constant
 	 * blinding is active. However, in case of error during
@@ -1376,6 +1389,12 @@  struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 	 */
 	if (!bpf_prog_is_dev_bound(fp->aux)) {
 		fp = bpf_int_jit_compile(fp);
+#ifdef CONFIG_BPF_JIT_ALWAYS_ON
+		if (!fp->jited) {
+			*err = -ENOTSUPP;
+			return fp;
+		}
+#endif
 	} else {
 		*err = bpf_prog_offload_compile(fp);
 		if (*err)
diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 9e9748089270..f369889e521d 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -6250,9 +6250,8 @@  static struct bpf_prog *generate_filter(int which, int *err)
 				return NULL;
 			}
 		}
-		/* We don't expect to fail. */
 		if (*err) {
-			pr_cont("FAIL to attach err=%d len=%d\n",
+			pr_cont("FAIL to prog_create err=%d len=%d\n",
 				*err, fprog.len);
 			return NULL;
 		}
@@ -6276,6 +6275,10 @@  static struct bpf_prog *generate_filter(int which, int *err)
 		 * checks.
 		 */
 		fp = bpf_prog_select_runtime(fp, err);
+		if (*err) {
+			pr_cont("FAIL to select_runtime err=%d\n", *err);
+			return NULL;
+		}
 		break;
 	}
 
@@ -6461,8 +6464,8 @@  static __init int test_bpf(void)
 				pass_cnt++;
 				continue;
 			}
-
-			return err;
+			err_cnt++;
+			continue;
 		}
 
 		pr_cont("jited:%u ", fp->jited);
diff --git a/net/core/filter.c b/net/core/filter.c
index 6a85e67fafce..d339ef170df6 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1054,11 +1054,9 @@  static struct bpf_prog *bpf_migrate_filter(struct bpf_prog *fp)
 		 */
 		goto out_err_free;
 
-	/* We are guaranteed to never error here with cBPF to eBPF
-	 * transitions, since there's no issue with type compatibility
-	 * checks on program arrays.
-	 */
 	fp = bpf_prog_select_runtime(fp, &err);
+	if (err)
+		goto out_err_free;
 
 	kfree(old_prog);
 	return fp;
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index cbc3dde4cfcc..a47ad6cd41c0 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -325,7 +325,13 @@  static struct ctl_table net_core_table[] = {
 		.data		= &bpf_jit_enable,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
+#ifndef CONFIG_BPF_JIT_ALWAYS_ON
 		.proc_handler	= proc_dointvec
+#else
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &one,
+		.extra2		= &one,
+#endif
 	},
 # ifdef CONFIG_HAVE_EBPF_JIT
 	{
diff --git a/net/socket.c b/net/socket.c
index 05f361faec45..78acd6ce74c7 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2619,6 +2619,15 @@  static int __init sock_init(void)
 
 core_initcall(sock_init);	/* early initcall */
 
+static int __init jit_init(void)
+{
+#ifdef CONFIG_BPF_JIT_ALWAYS_ON
+	bpf_jit_enable = 1;
+#endif
+	return 0;
+}
+pure_initcall(jit_init);
+
 #ifdef CONFIG_PROC_FS
 void socket_seq_show(struct seq_file *seq)
 {
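
For the net/core/sysctl_net_core.c hunk above: switching the handler to
proc_dointvec_minmax with both bounds pointing at the same value leaves only
that one value writable. The user-space model below sketches that range check;
struct model_int_range and model_minmax_write are hypothetical names for this
example, and the real handler also deals with parsing and user-space buffers,
which are ignored here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the extra1/extra2 pair of a ctl_table entry.
 * A NULL pointer means "no bound on this side". */
struct model_int_range {
	const int *min;
	const int *max;
};

/* Model of a bounded integer sysctl write: out-of-range values are
 * rejected with -EINVAL and the stored value is left untouched. */
int model_minmax_write(int *data, int value, const struct model_int_range *r)
{
	assert(data != NULL && r != NULL);

	if ((r->min && value < *r->min) || (r->max && value > *r->max))
		return -EINVAL;

	*data = value;
	return 0;
}
```

With min and max both set to 1, as in the patch, a write of 0 (or any value
other than 1) fails in this model and the JIT stays enabled, while an
unbounded range accepts any integer, as plain proc_dointvec does.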