
arm64: net: bpf: don't BUG() on large shifts

Message ID 1452015543-6790-1-git-send-email-rabin@rab.in
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Rabin Vincent Jan. 5, 2016, 5:39 p.m. UTC
Attempting to generate UBFM/SBFM instructions with shifts that can't be
encoded in the immediate fields of the opcodes leads to a trigger of a
BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
shift amounts must be in the range 0 to one less than the register width
of the instruction, inclusive."  Make the JIT reject unencodable shifts
instead of crashing.

 ------------[ cut here ]------------
 kernel BUG at arch/arm64/kernel/insn.c:766!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
 CPU: 0 PID: 669 Comm: insmod Not tainted 4.4.0-rc8+ #4
 PC is at aarch64_insn_gen_bitfield+0xcc/0xd4
 LR is at build_body+0x1000/0x2914
 ..
 Call trace:
 [<ffffffc00008c65c>] aarch64_insn_gen_bitfield+0xcc/0xd4
 [<ffffffc000096bfc>] build_body+0x1000/0x2914
 [<ffffffc000098590>] bpf_int_jit_compile+0x7c/0x1b4
 [<ffffffc000130d10>] bpf_prog_select_runtime+0x20/0xcc
 [<ffffffc0004afbac>] bpf_prepare_filter+0x3d8/0x3e8
 [<ffffffc0004afc30>] bpf_prog_create+0x74/0xa4
 [<ffffffbffc3de1d4>] test_bpf_init+0x1d4/0x748 [test_bpf]
 [<ffffffc00008293c>] do_one_initcall+0x90/0x1a8
 [<ffffffc000140c4c>] do_init_module+0x60/0x1c8
 [<ffffffc00011bdcc>] load_module+0x1554/0x1c98
 [<ffffffc00011c62c>] SyS_init_module+0x11c/0x140
 [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28

Signed-off-by: Rabin Vincent <rabin@rab.in>
---
 arch/arm64/net/bpf_jit_comp.c | 7 +++++++
 1 file changed, 7 insertions(+)
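
For illustration, a hand-constructed classic BPF filter along these lines (the
shift amount of 40 is arbitrary; anything at or above the register width is
unencodable) would reach the offending UBFM generation once the converted
program hits the arm64 JIT:

	BPF_STMT(BPF_LD | BPF_IMM, 1),
	BPF_STMT(BPF_ALU | BPF_LSH | BPF_K, 40),	/* shift >= 32: no valid encoding */
	BPF_STMT(BPF_RET | BPF_A, 0)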

Comments

Alexei Starovoitov Jan. 5, 2016, 5:55 p.m. UTC | #1
On Tue, Jan 05, 2016 at 06:39:03PM +0100, Rabin Vincent wrote:
> Attempting to generate UBFM/SBFM instructions with shifts that can't be
> encoded in the immediate fields of the opcodes leads to a trigger of a
> BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
> shift amounts must be in the range 0 to one less than the register width
> of the instruction, inclusive."  Make the JIT reject unencodable shifts
> instead of crashing.
> 
>  ------------[ cut here ]------------
>  kernel BUG at arch/arm64/kernel/insn.c:766!
>  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
>  CPU: 0 PID: 669 Comm: insmod Not tainted 4.4.0-rc8+ #4
>  PC is at aarch64_insn_gen_bitfield+0xcc/0xd4
>  LR is at build_body+0x1000/0x2914
>  ..
>  Call trace:
>  [<ffffffc00008c65c>] aarch64_insn_gen_bitfield+0xcc/0xd4
>  [<ffffffc000096bfc>] build_body+0x1000/0x2914
>  [<ffffffc000098590>] bpf_int_jit_compile+0x7c/0x1b4
>  [<ffffffc000130d10>] bpf_prog_select_runtime+0x20/0xcc
>  [<ffffffc0004afbac>] bpf_prepare_filter+0x3d8/0x3e8
>  [<ffffffc0004afc30>] bpf_prog_create+0x74/0xa4
>  [<ffffffbffc3de1d4>] test_bpf_init+0x1d4/0x748 [test_bpf]
>  [<ffffffc00008293c>] do_one_initcall+0x90/0x1a8
>  [<ffffffc000140c4c>] do_init_module+0x60/0x1c8
>  [<ffffffc00011bdcc>] load_module+0x1554/0x1c98
>  [<ffffffc00011c62c>] SyS_init_module+0x11c/0x140
>  [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28
> 
> Signed-off-by: Rabin Vincent <rabin@rab.in>
> ---
>  arch/arm64/net/bpf_jit_comp.c | 7 +++++++
>  1 file changed, 7 insertions(+)

this one is better to be addressed in verifier instead of eBPF JITs.
Please reject it in check_alu_op() instead.
Though this bug is arm64 only and doesn't affect x64, it's better
to reject such invalid programs, since shifts with large constants
can only be created manually; llvm doesn't generate such things.
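
A minimal sketch of such a check inside check_alu_op() (untested; the error
reporting is purely illustrative) might be:

	if ((opcode == BPF_LSH || opcode == BPF_RSH || opcode == BPF_ARSH) &&
	    BPF_SRC(insn->code) == BPF_K) {
		int size = BPF_CLASS(insn->code) == BPF_ALU64 ? 64 : 32;

		/* constant shift amounts must stay within the register width */
		if (insn->imm < 0 || insn->imm >= size) {
			verbose("invalid shift %d\n", insn->imm);
			return -EINVAL;
		}
	}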

Rabin Vincent Jan. 6, 2016, 8:31 p.m. UTC | #2
On Tue, Jan 05, 2016 at 09:55:58AM -0800, Alexei Starovoitov wrote:
> this one is better to be addressed in verifier instead of eBPF JITs.
> Please reject it in check_alu_op() instead.

AFAICS the eBPF verifier is not called on the eBPF filters generated by
the BPF->eBPF conversion in net/core/filter.c, so performing this check
only in check_alu_op() will be insufficient.  So I think we'd need to
add this check to bpf_check_classic() too.  Or am I missing something?
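
A corresponding sketch for bpf_check_classic(), as extra cases in its
per-instruction switch (untested; assuming the usual ftest pointer to the
sock_filter instruction being checked):

	case BPF_ALU | BPF_LSH | BPF_K:
	case BPF_ALU | BPF_RSH | BPF_K:
		/* classic BPF is 32-bit only, so reject K shifts of 32 or more */
		if (ftest->k >= 32)
			return -EINVAL;
		break;
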
Alexei Starovoitov Jan. 6, 2016, 10:12 p.m. UTC | #3
On Wed, Jan 06, 2016 at 09:31:27PM +0100, Rabin Vincent wrote:
> On Tue, Jan 05, 2016 at 09:55:58AM -0800, Alexei Starovoitov wrote:
> > this one is better to be addressed in verifier instead of eBPF JITs.
> > Please reject it in check_alu_op() instead.
> 
> AFAICS the eBPF verifier is not called on the eBPF filters generated by
> the BPF->eBPF conversion in net/core/filter.c, so performing this check
> only in check_alu_op() will be insufficient.  So I think we'd need to
> add this check to bpf_check_classic() too.  Or am I missing something?

correct. The check is needed in bpf_check_classic() too, and I believe
it's ok to tighten it up in this case, since a >32 shift is
invalid/undefined anyway. We can either accept it as a nop, mask it (K &= 31),
or return an error. I think returning an error is more user friendly long term,
though there is a small risk of rejecting previously loadable broken programs.

David Laight Jan. 7, 2016, 11:07 a.m. UTC | #4
From: Alexei Starovoitov
> Sent: 06 January 2016 22:13
> On Wed, Jan 06, 2016 at 09:31:27PM +0100, Rabin Vincent wrote:
> > On Tue, Jan 05, 2016 at 09:55:58AM -0800, Alexei Starovoitov wrote:
> > > this one is better to be addressed in verifier instead of eBPF JITs.
> > > Please reject it in check_alu_op() instead.
> >
> > AFAICS the eBPF verifier is not called on the eBPF filters generated by
> > the BPF->eBPF conversion in net/core/filter.c, so performing this check
> > only in check_alu_op() will be insufficient.  So I think we'd need to
> > add this check to bpf_check_classic() too.  Or am I missing something?
> 
> correct. the check is needed in bpf_check_classic() too and I believe
> it's ok to tighten it up in this case, since >32 shift is
> invalid/undefined anyway. We can either accept it as nop or K&=31
> or error. I think returning error is more user friendly long term, though
> there is a small risk of rejecting previously loadable broken programs.

Or replace with an assignment of zero?

	David

Daniel Borkmann Jan. 7, 2016, 12:48 p.m. UTC | #5
On 01/07/2016 12:07 PM, David Laight wrote:
> From: Alexei Starovoitov
>> Sent: 06 January 2016 22:13
>> On Wed, Jan 06, 2016 at 09:31:27PM +0100, Rabin Vincent wrote:
>>> On Tue, Jan 05, 2016 at 09:55:58AM -0800, Alexei Starovoitov wrote:
>>>> this one is better to be addressed in verifier instead of eBPF JITs.
>>>> Please reject it in check_alu_op() instead.
>>>
>>> AFAICS the eBPF verifier is not called on the eBPF filters generated by
>>> the BPF->eBPF conversion in net/core/filter.c, so performing this check
>>> only in check_alu_op() will be insufficient.  So I think we'd need to
>>> add this check to bpf_check_classic() too.  Or am I missing something?
>>
>> correct. the check is needed in bpf_check_classic() too and I believe
>> it's ok to tighten it up in this case, since >32 shift is
>> invalid/undefined anyway. We can either accept it as nop or K&=31
>> or error. I think returning error is more user friendly long term, though
>> there is a small risk of rejecting previously loadable broken programs.
>
> Or replace with an assignment of zero?

The question is what is less risky in terms of uabi: to reject
filters with such K shift vals upfront in the verifier, or to just allow
[0, reg_size - 1] values and handle outliers silently. I think both
might be possible; the latter just needs to be clearly specified in
the documentation somewhere. If we go for the latter, then probably
just rewriting that K value as a masked one would be better. Broken
programs might then still be loadable (and still be broken) ... afaik,
in the register case (shifts with X) with large shift vals,
ARM64 does 'modulo reg_size' implicitly.
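
If the silent-masking route were taken, the arm64 JIT could for instance clamp
the immediate when emitting the shift, roughly (sketch only; is64, dst, imm,
ctx and bits as in the bpf_jit_comp.c patch below):

	/* mask the shift amount into the encodable range instead of rejecting it */
	emit(A64_LSL(is64, dst, dst, imm & (bits - 1)), ctx);
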
Will Deacon Jan. 8, 2016, 3:44 p.m. UTC | #6
On Tue, Jan 05, 2016 at 06:39:03PM +0100, Rabin Vincent wrote:
> Attempting to generate UBFM/SBFM instructions with shifts that can't be
> encoded in the immediate fields of the opcodes leads to a trigger of a
> BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
> shift amounts must be in the range 0 to one less than the register width
> of the instruction, inclusive."  Make the JIT reject unencodable shifts
> instead of crashing.

I moaned about those BUG_ONs when they were introduced:

  https://lkml.org/lkml/2014/7/17/438

The response then was that the verifier would catch these issues so
there was nothing to worry about. Has something changed so that is no
longer the case? Do we need to consider a different way of rejecting
invalid instructions at the encoding stage rather than bringing down the
kernel?

Will
Rabin Vincent Jan. 8, 2016, 3:58 p.m. UTC | #7
On Thu, Jan 07, 2016 at 01:48:48PM +0100, Daniel Borkmann wrote:
> The question is what is less risky in terms of uabi. To reject such
> filters with such K shift vals upfront in verifier, or to just allow
> [0, reg_size - 1] values and handle outliers silently. I think both
> might be possible, the latter just needs to be clearly specified in
> the documentation somewhere. If we go for the latter, then probably
> just rewriting that K value as masked one might seem better. Broken
> programs might then still be loadable (and still be broken) ... afaik
> in case of register (case of shifts with X) with large shift vals
> ARM64 is doing 'modulo reg_size' implicitly.

The case of what happens with such shifts with X is also already
architecture-specific, even when using the interpreters.  For example, the
following program returns 1 on ARM64 but 0 on ARM.

	BPF_STMT(BPF_LD | BPF_IMM, 1),
	BPF_STMT(BPF_LDX | BPF_IMM, 32),
	BPF_STMT(BPF_ALU | BPF_LSH | BPF_X, 0),
	BPF_STMT(BPF_RET | BPF_A, 0)

To start rejecting large K shifts in the verifier because they are
architecture-specific while continuing to allow the equally
architecture-specific large X shifts (because we can't verify them
statically) would be rather inconsistent.

If it is desired to enforce uniformity across architectures despite the
risk for subtly changing the behaviour of existing programs, then the
desired uniform semantics of these shifts should really be implemented
both for the K and X shifts, which would mean modifying the interpreter
and the various arch JITs too.
Daniel Borkmann Jan. 8, 2016, 4:44 p.m. UTC | #8
On 01/08/2016 04:58 PM, Rabin Vincent wrote:
> On Thu, Jan 07, 2016 at 01:48:48PM +0100, Daniel Borkmann wrote:
>> The question is what is less risky in terms of uabi. To reject such
>> filters with such K shift vals upfront in verifier, or to just allow
>> [0, reg_size - 1] values and handle outliers silently. I think both
>> might be possible, the latter just needs to be clearly specified in
>> the documentation somewhere. If we go for the latter, then probably
>> just rewriting that K value as masked one might seem better. Broken
>> programs might then still be loadable (and still be broken) ... afaik
>> in case of register (case of shifts with X) with large shift vals
>> ARM64 is doing 'modulo reg_size' implicitly.
>
> The case of what happens with such shifts with X is also already
> architecture-specific, even when using the interpreters.  For example, the
> following program returns 1 on ARM64 but 0 on ARM.
>
> 	BPF_STMT(BPF_LD | BPF_IMM, 1),
> 	BPF_STMT(BPF_LDX | BPF_IMM, 32),
> 	BPF_STMT(BPF_ALU | BPF_LSH | BPF_X, 0),
> 	BPF_STMT(BPF_RET | BPF_A, 0)
>
> To start rejecting large K shifts in the verifier because they are
> architecture-specific while continuing to allow the equally
> architecture-specific large X shifts (because we can't verify them
> statically) would be rather inconsistent.

Hmm, yeah, agree with you that this would be inconsistent. Last time we
actually had this topic [1], I believe the consensus was that such BPF
programs are to be considered "undefined".

In that case, I think just masking the K value silently into its allowed
range for classic and eBPF might be better. It would eventually not be
uniform with regards to shifts where X is involved, but if we consider
it "undefined" behaviour, such uniformity is probably not needed.

> If it is desired to enforce uniformity across architectures despite the
> risk for subtly changing the behaviour of existing programs, then the
> desired uniform semantics of these shifts should really be implemented
> both for the K and X shifts, which would mean modifying the interpreter
> and the various arch JITs too.

   [1] https://lkml.org/lkml/2015/12/4/148
Alexei Starovoitov Jan. 8, 2016, 7:09 p.m. UTC | #9
On Fri, Jan 08, 2016 at 03:44:23PM +0000, Will Deacon wrote:
> On Tue, Jan 05, 2016 at 06:39:03PM +0100, Rabin Vincent wrote:
> > Attempting to generate UBFM/SBFM instructions with shifts that can't be
> > encoded in the immediate fields of the opcodes leads to a trigger of a
> > BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
> > shift amounts must be in the range 0 to one less than the register width
> > of the instruction, inclusive."  Make the JIT reject unencodable shifts
> > instead of crashing.
> 
> I moaned about those BUG_ONs when they were introduced:
> 
>   https://lkml.org/lkml/2014/7/17/438
> 
> The response then was that the verifier would catch these issues so
> there was nothing to worry about. Has something changed so that is no
> longer the case? Do we need to consider a different way of rejecting
> invalid instructions at the encoding stage rather than bringing down the
> kernel?

that discussion led to the replacement of all BUG_ONs in
arch/arm64/net/bpf_jit_comp.c with pr_err_once(), but it looks like
arch/arm64/kernel/insn.c wasn't addressed.
The number of BUG_ONs there is indeed overkill regardless of what the
verifier and other JITs do. btw, the x64 JIT doesn't have runtime BUG_ONs.
Alexei Starovoitov Jan. 8, 2016, 7:18 p.m. UTC | #10
On Fri, Jan 08, 2016 at 05:44:42PM +0100, Daniel Borkmann wrote:
> On 01/08/2016 04:58 PM, Rabin Vincent wrote:
> >On Thu, Jan 07, 2016 at 01:48:48PM +0100, Daniel Borkmann wrote:
> >>The question is what is less risky in terms of uabi. To reject such
> >>filters with such K shift vals upfront in verifier, or to just allow
> >>[0, reg_size - 1] values and handle outliers silently. I think both
> >>might be possible, the latter just needs to be clearly specified in
> >>the documentation somewhere. If we go for the latter, then probably
> >>just rewriting that K value as masked one might seem better. Broken
> >>programs might then still be loadable (and still be broken) ... afaik
> >>in case of register (case of shifts with X) with large shift vals
> >>ARM64 is doing 'modulo reg_size' implicitly.
> >
> >The case of what happens with such shifts with X is also already
> >architecture-specific, even when using the interpreters.  For example, the
> >following program returns 1 on ARM64 but 0 on ARM.
> >
> >	BPF_STMT(BPF_LD | BPF_IMM, 1),
> >	BPF_STMT(BPF_LDX | BPF_IMM, 32),
> >	BPF_STMT(BPF_ALU | BPF_LSH | BPF_X, 0),
> >	BPF_STMT(BPF_RET | BPF_A, 0)
> >
> >To start rejecting large K shifts in the verifier because they are
> >architecture-specific while continuing to allow the equally
> >architecture-specific large X shifts (because we can't verify them
> >statically) would be rather inconsistent.
> 
> Hmm, yeah, agree with you that this would be inconsistent. Last time we
> actually had this topic [1], I believe the consensus was that such BPF
> programs are to be considered "undefined".
> 
> In that case, I think just masking the K value silently into its allowed
> range for classic and eBPF might be better. It would eventually not be
> uniform with regards to shifts where X is involved, but if we consider
> it "undefined" behaviour, such uniformity is probably not needed.

I don't think masking the K makes it any more 'consistent'.
As was shown, arm and arm64 cpus are inconsistent between themselves,
and it's not the job of sw to cover up such things.
imo rejecting broken programs at load time for both cBPF and eBPF
is better for users, program authors and compiler writers in
the first place. If llvm (due to a bug) generates a shift with a constant > 32,
I'd prefer the verifier to tell me about it instead of silently doing the mask,
which will likely be much harder to discover, and the program is going to stay
broken anyway.
Will Deacon Jan. 12, 2016, 5:17 p.m. UTC | #11
On Fri, Jan 08, 2016 at 11:09:44AM -0800, Alexei Starovoitov wrote:
> On Fri, Jan 08, 2016 at 03:44:23PM +0000, Will Deacon wrote:
> > On Tue, Jan 05, 2016 at 06:39:03PM +0100, Rabin Vincent wrote:
> > > Attempting to generate UBFM/SBFM instructions with shifts that can't be
> > > encoded in the immediate fields of the opcodes leads to a trigger of a
> > > BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
> > > shift amounts must be in the range 0 to one less than the register width
> > > of the instruction, inclusive."  Make the JIT reject unencodable shifts
> > > instead of crashing.
> > 
> > I moaned about those BUG_ONs when they were introduced:
> > 
> >   https://lkml.org/lkml/2014/7/17/438
> > 
> > The response then was that the verifier would catch these issues so
> > there was nothing to worry about. Has something changed so that is no
> > longer the case? Do we need to consider a different way of rejecting
> > invalid instructions at the encoding stage rather than bringing down the
> > kernel?
> 
> that discussion lead to replacement of all BUG_ONs in
> arch/arm64/net/bpf_jit_comp.c with pr_err_once(), but looks like
> arch/arm64/kernel/insn.c wasn't addressed.
> The amount of BUG_ONs there is indeed overkill regardless of what
> verifier and other JITs do. btw, x64 JIT doesn't have runtime BUG_ONs.

Maybe, but insn.c is also used by the alternatives patching code, so we
really need a way to communicate failure back to the BPF JIT when passed
an invalid instruction description.

Will
Alexei Starovoitov Jan. 12, 2016, 7:23 p.m. UTC | #12
On Tue, Jan 12, 2016 at 05:17:10PM +0000, Will Deacon wrote:
> On Fri, Jan 08, 2016 at 11:09:44AM -0800, Alexei Starovoitov wrote:
> > On Fri, Jan 08, 2016 at 03:44:23PM +0000, Will Deacon wrote:
> > > On Tue, Jan 05, 2016 at 06:39:03PM +0100, Rabin Vincent wrote:
> > > > Attempting to generate UBFM/SBFM instructions with shifts that can't be
> > > > encoded in the immediate fields of the opcodes leads to a trigger of a
> > > > BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
> > > > shift amounts must be in the range 0 to one less than the register width
> > > > of the instruction, inclusive."  Make the JIT reject unencodable shifts
> > > > instead of crashing.
> > > 
> > > I moaned about those BUG_ONs when they were introduced:
> > > 
> > >   https://lkml.org/lkml/2014/7/17/438
> > > 
> > > The response then was that the verifier would catch these issues so
> > > there was nothing to worry about. Has something changed so that is no
> > > longer the case? Do we need to consider a different way of rejecting
> > > invalid instructions at the encoding stage rather than bringing down the
> > > kernel?
> > 
> > that discussion lead to replacement of all BUG_ONs in
> > arch/arm64/net/bpf_jit_comp.c with pr_err_once(), but looks like
> > arch/arm64/kernel/insn.c wasn't addressed.
> > The amount of BUG_ONs there is indeed overkill regardless of what
> > verifier and other JITs do. btw, x64 JIT doesn't have runtime BUG_ONs.
> 
> Maybe, but insn.c is also used by the alternatives patching code, so we
> really need a way to communicate failure back to the BPF JIT when passed
> an invalid instruction description.

agree. I think there are several options to achieve that after
all BUG_ONs are removed:
- change the interface of all insn-generating macros to check for
  the AARCH64_BREAK_FAULT opcode as an error.
  That will require all of the emit*() functions in bpf_jit_comp.c to
  be changed to accept/return an error.
  Overall that looks like a massive change.
- ignore AARCH64_BREAK_FAULT during emit and add another pass after
  all code is generated. If such an insn is found in the jited code,
  discard the jit.
  I think that's the better option.
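
The second option could amount to a small post-pass over the generated image,
something like this untested sketch (the helper name is arbitrary; ctx->image
holds the instructions that emit() stored with cpu_to_le32()):

	static int validate_code(struct jit_ctx *ctx)
	{
		int i;

		for (i = 0; i < ctx->idx; i++) {
			/* any insn that failed to encode shows up as a break */
			if (le32_to_cpu(ctx->image[i]) == AARCH64_BREAK_FAULT)
				return -1;
		}

		return 0;
	}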

Zi, any comments?
Zi Shen Lim Jan. 13, 2016, 4:45 a.m. UTC | #13
On Tue, Jan 12, 2016 at 11:23 AM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Tue, Jan 12, 2016 at 05:17:10PM +0000, Will Deacon wrote:
>> On Fri, Jan 08, 2016 at 11:09:44AM -0800, Alexei Starovoitov wrote:
>> > On Fri, Jan 08, 2016 at 03:44:23PM +0000, Will Deacon wrote:
>> > > On Tue, Jan 05, 2016 at 06:39:03PM +0100, Rabin Vincent wrote:
>> > > > Attempting to generate UBFM/SBFM instructions with shifts that can't be
>> > > > encoded in the immediate fields of the opcodes leads to a trigger of a
>> > > > BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
>> > > > shift amounts must be in the range 0 to one less than the register width
>> > > > of the instruction, inclusive."  Make the JIT reject unencodable shifts
>> > > > instead of crashing.
>> > >
>> > > I moaned about those BUG_ONs when they were introduced:
>> > >
>> > >   https://lkml.org/lkml/2014/7/17/438
>> > >
>> > > The response then was that the verifier would catch these issues so
>> > > there was nothing to worry about. Has something changed so that is no
>> > > longer the case? Do we need to consider a different way of rejecting
>> > > invalid instructions at the encoding stage rather than bringing down the
>> > > kernel?
>> >
>> > that discussion lead to replacement of all BUG_ONs in
>> > arch/arm64/net/bpf_jit_comp.c with pr_err_once(), but looks like
>> > arch/arm64/kernel/insn.c wasn't addressed.
>> > The amount of BUG_ONs there is indeed overkill regardless of what
>> > verifier and other JITs do. btw, x64 JIT doesn't have runtime BUG_ONs.
>>
>> Maybe, but insn.c is also used by the alternatives patching code, so we
>> really need a way to communicate failure back to the BPF JIT when passed
>> an invalid instruction description.
>
> agree. I think there are several options to achieve that after
> all BUG_ONs are removed:
> - change interface for all insn generating macros to check for
>   AARCH64_BREAK_FAULT opcode as error.
>   That will require all of emit*() functions in bpf_jit_comp.c to
>   be changed to accept/return error.
>   Overall that looks like massive change.
> - ignore AARCH64_BREAK_FAULT during emit and add another pass after
>   all code is generated. If such insn is found in a jited code,
>   discard the jit.
>   I think that's better option.
>
> Zi, any comments?
>

Alexei, agreed. Second approach is cleaner. Full disclosure: I did not
look at other callers beyond JIT.

Separately, it sounds like there's now a preference and consensus for
removing all BUGs and BUG_ONs in insn.c. I did a quick grep of insn.c
and noticed one legacy instance, followed by many introduced around the
same time as the JIT, and new additions since.

Will, any thoughts on the following replacement scheme?

BUG_ON() for codegen ==> pr_err(); return AARCH64_BREAK_FAULT;
BUG() for decoding ==> leave as is.
remaining BUG_ON() ==> leave as is.
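
Applied to the bitfield case from the oops above, that scheme would turn the
BUG_ON() in aarch64_insn_gen_bitfield() into something along these lines
(sketch; mask stands for the range-check mask already computed for the chosen
variant, and the message text is illustrative):

	/* was: BUG_ON(immr & ~mask); */
	if (immr & ~mask) {
		pr_err("%s: invalid immr value %d\n", __func__, immr);
		return AARCH64_BREAK_FAULT;
	}
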
Will Deacon Jan. 13, 2016, 12:08 p.m. UTC | #14
On Tue, Jan 12, 2016 at 08:45:43PM -0800, Z Lim wrote:
> On Tue, Jan 12, 2016 at 11:23 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > On Tue, Jan 12, 2016 at 05:17:10PM +0000, Will Deacon wrote:
> >> On Fri, Jan 08, 2016 at 11:09:44AM -0800, Alexei Starovoitov wrote:
> >> > On Fri, Jan 08, 2016 at 03:44:23PM +0000, Will Deacon wrote:
> >> > > On Tue, Jan 05, 2016 at 06:39:03PM +0100, Rabin Vincent wrote:
> >> > > > Attempting to generate UBFM/SBFM instructions with shifts that can't be
> >> > > > encoded in the immediate fields of the opcodes leads to a trigger of a
> >> > > > BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
> >> > > > shift amounts must be in the range 0 to one less than the register width
> >> > > > of the instruction, inclusive."  Make the JIT reject unencodable shifts
> >> > > > instead of crashing.
> >> > >
> >> > > I moaned about those BUG_ONs when they were introduced:
> >> > >
> >> > >   https://lkml.org/lkml/2014/7/17/438
> >> > >
> >> > > The response then was that the verifier would catch these issues so
> >> > > there was nothing to worry about. Has something changed so that is no
> >> > > longer the case? Do we need to consider a different way of rejecting
> >> > > invalid instructions at the encoding stage rather than bringing down the
> >> > > kernel?
> >> >
> >> > that discussion lead to replacement of all BUG_ONs in
> >> > arch/arm64/net/bpf_jit_comp.c with pr_err_once(), but looks like
> >> > arch/arm64/kernel/insn.c wasn't addressed.
> >> > The amount of BUG_ONs there is indeed overkill regardless of what
> >> > verifier and other JITs do. btw, x64 JIT doesn't have runtime BUG_ONs.
> >>
> >> Maybe, but insn.c is also used by the alternatives patching code, so we
> >> really need a way to communicate failure back to the BPF JIT when passed
> >> an invalid instruction description.
> >
> > agree. I think there are several options to achieve that after
> > all BUG_ONs are removed:
> > - change interface for all insn generating macros to check for
> >   AARCH64_BREAK_FAULT opcode as error.
> >   That will require all of emit*() functions in bpf_jit_comp.c to
> >   be changed to accept/return error.
> >   Overall that looks like massive change.
> > - ignore AARCH64_BREAK_FAULT during emit and add another pass after
> >   all code is generated. If such insn is found in a jited code,
> >   discard the jit.
> >   I think that's better option.
> >
> > Zi, any comments?
> >
> 
> Alexei, agreed. Second approach is cleaner. Full disclosure: I did not
> look at other callers beyond JIT.
> 
> Separately, sounds like there's now preference and consensus to
> removing all BUGs and BUG_ONs in insn.c. Did a quick grep of insn.c
> and noticed a legacy instance, followed by many introduced around the
> same time as JIT, and new additions since.
> 
> Will, any thoughts on the following replacement scheme?
> 
> BUG_ON() for codegen ==> pr_err(); return AARCH64_BREAK_FAULT;
> BUG() for decoding ==> leave as is.
> remaining BUG_ON() ==> leave as is.

That sounds good to me, thanks.

Will

Patch

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index b162ad70effc..3f4f089a85c0 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -255,6 +255,7 @@  static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	const s32 imm = insn->imm;
 	const int i = insn - ctx->prog->insnsi;
 	const bool is64 = BPF_CLASS(code) == BPF_ALU64;
+	const int bits = is64 ? 64 : 32;
 	u8 jmp_cond;
 	s32 jmp_offset;
 
@@ -444,14 +445,20 @@  emit_bswap_uxt:
 		break;
 	case BPF_ALU | BPF_LSH | BPF_K:
 	case BPF_ALU64 | BPF_LSH | BPF_K:
+		if (imm < 0 || imm >= bits)
+			return -EINVAL;
 		emit(A64_LSL(is64, dst, dst, imm), ctx);
 		break;
 	case BPF_ALU | BPF_RSH | BPF_K:
 	case BPF_ALU64 | BPF_RSH | BPF_K:
+		if (imm < 0 || imm >= bits)
+			return -EINVAL;
 		emit(A64_LSR(is64, dst, dst, imm), ctx);
 		break;
 	case BPF_ALU | BPF_ARSH | BPF_K:
 	case BPF_ALU64 | BPF_ARSH | BPF_K:
+		if (imm < 0 || imm >= bits)
+			return -EINVAL;
 		emit(A64_ASR(is64, dst, dst, imm), ctx);
 		break;