Message ID | 20180420221842.742330-5-yhs@fb.com |
---|---|
State | Changes Requested, archived |
Delegated to: | BPF Maintainers |
Headers | show |
Series | bpf: add bpf_get_stack helper | expand |
On Fri, Apr 20, 2018 at 03:18:37PM -0700, Yonghong Song wrote: > When helpers like bpf_get_stack returns an int value > and later on used for arithmetic computation, the LSH and ARSH > operations are often required to get proper sign extension into > 64-bit. For example, without this patch: > 54: R0=inv(id=0,umax_value=800) > 54: (bf) r8 = r0 > 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) > 55: (67) r8 <<= 32 > 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) > 56: (c7) r8 s>>= 32 > 57: R8=inv(id=0) > With this patch: > 54: R0=inv(id=0,umax_value=800) > 54: (bf) r8 = r0 > 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) > 55: (67) r8 <<= 32 > 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) > 56: (c7) r8 s>>= 32 > 57: R8=inv(id=0, umax_value=800,var_off=(0x0; 0x3ff)) > With better range of "R8", later on when "R8" is added to other register, > e.g., a map pointer or scalar-value register, the better register > range can be derived and verifier failure may be avoided. > > In our later example, > ...... > usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK); > if (usize < 0) > return 0; > ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); > ...... > Without improving ARSH value range tracking, the register representing > "max_len - usize" will have smin_value equal to S64_MIN and will be > rejected by verifier. > > Signed-off-by: Yonghong Song <yhs@fb.com> > --- > kernel/bpf/verifier.c | 26 ++++++++++++++++++++++++++ > 1 file changed, 26 insertions(+) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 3c8bb92..01c215d 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, > /* We may learn something more from the var_off */ > __update_reg_bounds(dst_reg); > break; > + case BPF_ARSH: > + if (umax_val >= insn_bitness) { > + /* Shifts greater than 31 or 63 are undefined. > + * This includes shifts by a negative number. > + */ > + mark_reg_unknown(env, regs, insn->dst_reg); > + break; > + } > + if (dst_reg->smin_value < 0) > + dst_reg->smin_value >>= umin_val; > + else > + dst_reg->smin_value >>= umax_val; > + if (dst_reg->smax_value < 0) > + dst_reg->smax_value >>= umax_val; > + else > + dst_reg->smax_value >>= umin_val; > + if (src_known) > + dst_reg->var_off = tnum_rshift(dst_reg->var_off, > + umin_val); > + else > + dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val); > + dst_reg->umin_value >>= umax_val; > + dst_reg->umax_value >>= umin_val; > + /* We may learn something more from the var_off */ > + __update_reg_bounds(dst_reg); I'm struggling to understand how these bounds are computed. Could you add examples in the comments? In particular if dst_reg is unknown (tnum.mask == -1) the above tnum_rshift() will clear upper bits and will make it 64-bit positive, but that doesn't seem correct. What am I missing?
On 4/22/18 5:16 PM, Alexei Starovoitov wrote: > On Fri, Apr 20, 2018 at 03:18:37PM -0700, Yonghong Song wrote: >> When helpers like bpf_get_stack returns an int value >> and later on used for arithmetic computation, the LSH and ARSH >> operations are often required to get proper sign extension into >> 64-bit. For example, without this patch: >> 54: R0=inv(id=0,umax_value=800) >> 54: (bf) r8 = r0 >> 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) >> 55: (67) r8 <<= 32 >> 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) >> 56: (c7) r8 s>>= 32 >> 57: R8=inv(id=0) >> With this patch: >> 54: R0=inv(id=0,umax_value=800) >> 54: (bf) r8 = r0 >> 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) >> 55: (67) r8 <<= 32 >> 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) >> 56: (c7) r8 s>>= 32 >> 57: R8=inv(id=0, umax_value=800,var_off=(0x0; 0x3ff)) >> With better range of "R8", later on when "R8" is added to other register, >> e.g., a map pointer or scalar-value register, the better register >> range can be derived and verifier failure may be avoided. >> >> In our later example, >> ...... >> usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK); >> if (usize < 0) >> return 0; >> ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); >> ...... >> Without improving ARSH value range tracking, the register representing >> "max_len - usize" will have smin_value equal to S64_MIN and will be >> rejected by verifier. >> >> Signed-off-by: Yonghong Song <yhs@fb.com> >> --- >> kernel/bpf/verifier.c | 26 ++++++++++++++++++++++++++ >> 1 file changed, 26 insertions(+) >> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >> index 3c8bb92..01c215d 100644 >> --- a/kernel/bpf/verifier.c >> +++ b/kernel/bpf/verifier.c >> @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, >> /* We may learn something more from the var_off */ >> __update_reg_bounds(dst_reg); >> break; >> + case BPF_ARSH: >> + if (umax_val >= insn_bitness) { >> + /* Shifts greater than 31 or 63 are undefined. >> + * This includes shifts by a negative number. >> + */ >> + mark_reg_unknown(env, regs, insn->dst_reg); >> + break; >> + } >> + if (dst_reg->smin_value < 0) >> + dst_reg->smin_value >>= umin_val; >> + else >> + dst_reg->smin_value >>= umax_val; >> + if (dst_reg->smax_value < 0) >> + dst_reg->smax_value >>= umax_val; >> + else >> + dst_reg->smax_value >>= umin_val; >> + if (src_known) >> + dst_reg->var_off = tnum_rshift(dst_reg->var_off, >> + umin_val); >> + else >> + dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val); >> + dst_reg->umin_value >>= umax_val; >> + dst_reg->umax_value >>= umin_val; >> + /* We may learn something more from the var_off */ >> + __update_reg_bounds(dst_reg); > > I'm struggling to understand how these bounds are computed. > Could you add examples in the comments? Okay, let me try to add some comments for better understanding. > In particular if dst_reg is unknown (tnum.mask == -1) > the above tnum_rshift() will clear upper bits and will make it > 64-bit positive, but that doesn't seem correct. > What am I missing? Considering this is arith shift, we probably should just have dst_reg->var_off = tnum_unknown to be conservative. I could miss something here as well. Let me try to write more detailed explanation, hopefully to cover all corner cases.
On Sun, Apr 22, 2018 at 07:49:13PM -0700, Yonghong Song wrote: > > > On 4/22/18 5:16 PM, Alexei Starovoitov wrote: > > On Fri, Apr 20, 2018 at 03:18:37PM -0700, Yonghong Song wrote: > > > When helpers like bpf_get_stack returns an int value > > > and later on used for arithmetic computation, the LSH and ARSH > > > operations are often required to get proper sign extension into > > > 64-bit. For example, without this patch: > > > 54: R0=inv(id=0,umax_value=800) > > > 54: (bf) r8 = r0 > > > 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) > > > 55: (67) r8 <<= 32 > > > 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) > > > 56: (c7) r8 s>>= 32 > > > 57: R8=inv(id=0) > > > With this patch: > > > 54: R0=inv(id=0,umax_value=800) > > > 54: (bf) r8 = r0 > > > 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) > > > 55: (67) r8 <<= 32 > > > 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) > > > 56: (c7) r8 s>>= 32 > > > 57: R8=inv(id=0, umax_value=800,var_off=(0x0; 0x3ff)) > > > With better range of "R8", later on when "R8" is added to other register, > > > e.g., a map pointer or scalar-value register, the better register > > > range can be derived and verifier failure may be avoided. > > > > > > In our later example, > > > ...... > > > usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK); > > > if (usize < 0) > > > return 0; > > > ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); > > > ...... > > > Without improving ARSH value range tracking, the register representing > > > "max_len - usize" will have smin_value equal to S64_MIN and will be > > > rejected by verifier. > > > > > > Signed-off-by: Yonghong Song <yhs@fb.com> > > > --- > > > kernel/bpf/verifier.c | 26 ++++++++++++++++++++++++++ > > > 1 file changed, 26 insertions(+) > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > index 3c8bb92..01c215d 100644 > > > --- a/kernel/bpf/verifier.c > > > +++ b/kernel/bpf/verifier.c > > > @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, > > > /* We may learn something more from the var_off */ > > > __update_reg_bounds(dst_reg); > > > break; > > > + case BPF_ARSH: > > > + if (umax_val >= insn_bitness) { > > > + /* Shifts greater than 31 or 63 are undefined. > > > + * This includes shifts by a negative number. > > > + */ > > > + mark_reg_unknown(env, regs, insn->dst_reg); > > > + break; > > > + } > > > + if (dst_reg->smin_value < 0) > > > + dst_reg->smin_value >>= umin_val; > > > + else > > > + dst_reg->smin_value >>= umax_val; > > > + if (dst_reg->smax_value < 0) > > > + dst_reg->smax_value >>= umax_val; > > > + else > > > + dst_reg->smax_value >>= umin_val; > > > + if (src_known) > > > + dst_reg->var_off = tnum_rshift(dst_reg->var_off, > > > + umin_val); > > > + else > > > + dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val); > > > + dst_reg->umin_value >>= umax_val; > > > + dst_reg->umax_value >>= umin_val; > > > + /* We may learn something more from the var_off */ > > > + __update_reg_bounds(dst_reg); > > > > I'm struggling to understand how these bounds are computed. > > Could you add examples in the comments? > > Okay, let me try to add some comments for better understanding. > > > In particular if dst_reg is unknown (tnum.mask == -1) > > the above tnum_rshift() will clear upper bits and will make it > > 64-bit positive, but that doesn't seem correct. > > What am I missing? > > Considering this is arith shift, we probably should just have > dst_reg->var_off = tnum_unknown to be conservative. > > I could miss something here as well. Let me try to write more > detailed explanation, hopefully to cover all corner cases. Is there a use case for !src_known ? I think test_verifier should have 100% line coverage of verifier.c and every 'if' condition in the verifier needs to have real use case behind it. It's still on my todo list to get rid of [su][min|max]_value tracking that was introduced without solid justification.
On 4/22/18 9:19 PM, Alexei Starovoitov wrote: > On Sun, Apr 22, 2018 at 07:49:13PM -0700, Yonghong Song wrote: >> >> >> On 4/22/18 5:16 PM, Alexei Starovoitov wrote: >>> On Fri, Apr 20, 2018 at 03:18:37PM -0700, Yonghong Song wrote: >>>> When helpers like bpf_get_stack returns an int value >>>> and later on used for arithmetic computation, the LSH and ARSH >>>> operations are often required to get proper sign extension into >>>> 64-bit. For example, without this patch: >>>> 54: R0=inv(id=0,umax_value=800) >>>> 54: (bf) r8 = r0 >>>> 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) >>>> 55: (67) r8 <<= 32 >>>> 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) >>>> 56: (c7) r8 s>>= 32 >>>> 57: R8=inv(id=0) >>>> With this patch: >>>> 54: R0=inv(id=0,umax_value=800) >>>> 54: (bf) r8 = r0 >>>> 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) >>>> 55: (67) r8 <<= 32 >>>> 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) >>>> 56: (c7) r8 s>>= 32 >>>> 57: R8=inv(id=0, umax_value=800,var_off=(0x0; 0x3ff)) >>>> With better range of "R8", later on when "R8" is added to other register, >>>> e.g., a map pointer or scalar-value register, the better register >>>> range can be derived and verifier failure may be avoided. >>>> >>>> In our later example, >>>> ...... >>>> usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK); >>>> if (usize < 0) >>>> return 0; >>>> ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); >>>> ...... >>>> Without improving ARSH value range tracking, the register representing >>>> "max_len - usize" will have smin_value equal to S64_MIN and will be >>>> rejected by verifier. >>>> >>>> Signed-off-by: Yonghong Song <yhs@fb.com> >>>> --- >>>> kernel/bpf/verifier.c | 26 ++++++++++++++++++++++++++ >>>> 1 file changed, 26 insertions(+) >>>> >>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >>>> index 3c8bb92..01c215d 100644 >>>> --- a/kernel/bpf/verifier.c >>>> +++ b/kernel/bpf/verifier.c >>>> @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, >>>> /* We may learn something more from the var_off */ >>>> __update_reg_bounds(dst_reg); >>>> break; >>>> + case BPF_ARSH: >>>> + if (umax_val >= insn_bitness) { >>>> + /* Shifts greater than 31 or 63 are undefined. >>>> + * This includes shifts by a negative number. >>>> + */ >>>> + mark_reg_unknown(env, regs, insn->dst_reg); >>>> + break; >>>> + } >>>> + if (dst_reg->smin_value < 0) >>>> + dst_reg->smin_value >>= umin_val; >>>> + else >>>> + dst_reg->smin_value >>= umax_val; >>>> + if (dst_reg->smax_value < 0) >>>> + dst_reg->smax_value >>= umax_val; >>>> + else >>>> + dst_reg->smax_value >>= umin_val; >>>> + if (src_known) >>>> + dst_reg->var_off = tnum_rshift(dst_reg->var_off, >>>> + umin_val); >>>> + else >>>> + dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val); >>>> + dst_reg->umin_value >>= umax_val; >>>> + dst_reg->umax_value >>= umin_val; >>>> + /* We may learn something more from the var_off */ >>>> + __update_reg_bounds(dst_reg); >>> >>> I'm struggling to understand how these bounds are computed. >>> Could you add examples in the comments? >> >> Okay, let me try to add some comments for better understanding. >> >>> In particular if dst_reg is unknown (tnum.mask == -1) >>> the above tnum_rshift() will clear upper bits and will make it >>> 64-bit positive, but that doesn't seem correct. >>> What am I missing? >> >> Considering this is arith shift, we probably should just have >> dst_reg->var_off = tnum_unknown to be conservative. >> >> I could miss something here as well. Let me try to write more >> detailed explanation, hopefully to cover all corner cases. > > Is there a use case for !src_known ? For typical bpf programs, the shift amount should always be known... If src_known is true, it must be dealing custom packets or custom data structures in tracing, etc. > I think test_verifier should have 100% line coverage of verifier.c > and every 'if' condition in the verifier needs to have real use case > behind it. > It's still on my todo list to get rid of [su][min|max]_value tracking > that was introduced without solid justification. >
On Sun, Apr 22, 2018 at 09:31:19PM -0700, Yonghong Song wrote: > > > On 4/22/18 9:19 PM, Alexei Starovoitov wrote: > > On Sun, Apr 22, 2018 at 07:49:13PM -0700, Yonghong Song wrote: > > > > > > > > > On 4/22/18 5:16 PM, Alexei Starovoitov wrote: > > > > On Fri, Apr 20, 2018 at 03:18:37PM -0700, Yonghong Song wrote: > > > > > When helpers like bpf_get_stack returns an int value > > > > > and later on used for arithmetic computation, the LSH and ARSH > > > > > operations are often required to get proper sign extension into > > > > > 64-bit. For example, without this patch: > > > > > 54: R0=inv(id=0,umax_value=800) > > > > > 54: (bf) r8 = r0 > > > > > 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) > > > > > 55: (67) r8 <<= 32 > > > > > 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) > > > > > 56: (c7) r8 s>>= 32 > > > > > 57: R8=inv(id=0) > > > > > With this patch: > > > > > 54: R0=inv(id=0,umax_value=800) > > > > > 54: (bf) r8 = r0 > > > > > 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) > > > > > 55: (67) r8 <<= 32 > > > > > 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) > > > > > 56: (c7) r8 s>>= 32 > > > > > 57: R8=inv(id=0, umax_value=800,var_off=(0x0; 0x3ff)) > > > > > With better range of "R8", later on when "R8" is added to other register, > > > > > e.g., a map pointer or scalar-value register, the better register > > > > > range can be derived and verifier failure may be avoided. > > > > > > > > > > In our later example, > > > > > ...... > > > > > usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK); > > > > > if (usize < 0) > > > > > return 0; > > > > > ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); > > > > > ...... > > > > > Without improving ARSH value range tracking, the register representing > > > > > "max_len - usize" will have smin_value equal to S64_MIN and will be > > > > > rejected by verifier. > > > > > > > > > > Signed-off-by: Yonghong Song <yhs@fb.com> > > > > > --- > > > > > kernel/bpf/verifier.c | 26 ++++++++++++++++++++++++++ > > > > > 1 file changed, 26 insertions(+) > > > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > > > index 3c8bb92..01c215d 100644 > > > > > --- a/kernel/bpf/verifier.c > > > > > +++ b/kernel/bpf/verifier.c > > > > > @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, > > > > > /* We may learn something more from the var_off */ > > > > > __update_reg_bounds(dst_reg); > > > > > break; > > > > > + case BPF_ARSH: > > > > > + if (umax_val >= insn_bitness) { > > > > > + /* Shifts greater than 31 or 63 are undefined. > > > > > + * This includes shifts by a negative number. > > > > > + */ > > > > > + mark_reg_unknown(env, regs, insn->dst_reg); > > > > > + break; > > > > > + } > > > > > + if (dst_reg->smin_value < 0) > > > > > + dst_reg->smin_value >>= umin_val; > > > > > + else > > > > > + dst_reg->smin_value >>= umax_val; > > > > > + if (dst_reg->smax_value < 0) > > > > > + dst_reg->smax_value >>= umax_val; > > > > > + else > > > > > + dst_reg->smax_value >>= umin_val; > > > > > + if (src_known) > > > > > + dst_reg->var_off = tnum_rshift(dst_reg->var_off, > > > > > + umin_val); > > > > > + else > > > > > + dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val); > > > > > + dst_reg->umin_value >>= umax_val; > > > > > + dst_reg->umax_value >>= umin_val; > > > > > + /* We may learn something more from the var_off */ > > > > > + __update_reg_bounds(dst_reg); > > > > > > > > I'm struggling to understand how these bounds are computed. > > > > Could you add examples in the comments? > > > > > > Okay, let me try to add some comments for better understanding. > > > > > > > In particular if dst_reg is unknown (tnum.mask == -1) > > > > the above tnum_rshift() will clear upper bits and will make it > > > > 64-bit positive, but that doesn't seem correct. > > > > What am I missing? > > > > > > Considering this is arith shift, we probably should just have > > > dst_reg->var_off = tnum_unknown to be conservative. > > > > > > I could miss something here as well. Let me try to write more > > > detailed explanation, hopefully to cover all corner cases. > > > > Is there a use case for !src_known ? > > For typical bpf programs, the shift amount should always be known... > If src_known is true, it must be dealing custom packets or custom > data structures in tracing, etc. In the example it was <<= 32 and s>>= 32 only because newly introduced helper returns signed 32-bit integer that is later used in the math. I have a hard time imagining useful C code that needs arithmetic shift with a variable.
On 20/04/18 23:18, Yonghong Song wrote: > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 3c8bb92..01c215d 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, > /* We may learn something more from the var_off */ > __update_reg_bounds(dst_reg); > break; > + case BPF_ARSH: > + if (umax_val >= insn_bitness) { > + /* Shifts greater than 31 or 63 are undefined. > + * This includes shifts by a negative number. > + */ > + mark_reg_unknown(env, regs, insn->dst_reg); > + break; > + } > + if (dst_reg->smin_value < 0) > + dst_reg->smin_value >>= umin_val; > + else > + dst_reg->smin_value >>= umax_val; > + if (dst_reg->smax_value < 0) > + dst_reg->smax_value >>= umax_val; > + else > + dst_reg->smax_value >>= umin_val; > + if (src_known) > + dst_reg->var_off = tnum_rshift(dst_reg->var_off, > + umin_val); tnum_rshift is an unsigned shift, it won't do what you want here. I think you could write a tnum_arshift that looks something like this (UNTESTED!): struct tnum tnum_arshift(struct tnum a, u8 shift) { return TNUM(((s64)a.value) >> shift, ((s64)a.mask) >> shift); } Theory: if value sign bit is 1 then number is known negative so populate upper bits with known 1s. If mask sign bit is 1 then number might be negative so populate upper bits with unknown. Otherwise, number is known positive so populate upper bits with known 0s. > + else > + dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val); Applying the above here, tnum_arshift(tnum_unknown, ...) would always just return tnum_unknown, so just do "dst_reg->var_off = tnum_unknown;". The reason for the corresponding logic in the BPF_RSH case is that a right logical shift _always_ populates upper bits with zeroes. In any case these 'else' branches are currently never taken because they fall foul of the check Alexei added just before the switch, if (!src_known && opcode != BPF_ADD && opcode != BPF_SUB && opcode != BPF_AND) { __mark_reg_unknown(dst_reg); return 0; } So I can guarantee you haven't tested this code :-) > + dst_reg->umin_value >>= umax_val; > + dst_reg->umax_value >>= umin_val; FWIW I think the way to handle umin/umax here is to blow them away and just rely on inferring new ubounds from the sbounds (i.e. the inverse of what we do just above in case BPF_RSH) since BPF_ARSH is essentially an operation on the signed value. I don't think there is a need to support cases where the unsigned bounds of a signed shift of a value that may cross the sign boundary at (1<<63) are needed to verify a program. (Unlike in the unsigned shift case, it is at least _possible_ for there to be information from the ubounds that we can't get from the sbounds - but it's a contrived case that isn't likely to be useful in real programs.) -Ed > + /* We may learn something more from the var_off */ > + __update_reg_bounds(dst_reg); > + break; > default: > mark_reg_unknown(env, regs, insn->dst_reg); > break;
On 4/23/18 5:25 AM, Edward Cree wrote: > On 20/04/18 23:18, Yonghong Song wrote: >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >> index 3c8bb92..01c215d 100644 >> --- a/kernel/bpf/verifier.c >> +++ b/kernel/bpf/verifier.c >> @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, >> /* We may learn something more from the var_off */ >> __update_reg_bounds(dst_reg); >> break; >> + case BPF_ARSH: >> + if (umax_val >= insn_bitness) { >> + /* Shifts greater than 31 or 63 are undefined. >> + * This includes shifts by a negative number. >> + */ >> + mark_reg_unknown(env, regs, insn->dst_reg); >> + break; >> + } >> + if (dst_reg->smin_value < 0) >> + dst_reg->smin_value >>= umin_val; >> + else >> + dst_reg->smin_value >>= umax_val; >> + if (dst_reg->smax_value < 0) >> + dst_reg->smax_value >>= umax_val; >> + else >> + dst_reg->smax_value >>= umin_val; >> + if (src_known) >> + dst_reg->var_off = tnum_rshift(dst_reg->var_off, >> + umin_val); > tnum_rshift is an unsigned shift, it won't do what you want here. > I think you could write a tnum_arshift that looks something like this > (UNTESTED!): > > struct tnum tnum_arshift(struct tnum a, u8 shift) > { > return TNUM(((s64)a.value) >> shift, ((s64)a.mask) >> shift); > } > Theory: if value sign bit is 1 then number is known negative so populate > upper bits with known 1s. If mask sign bit is 1 then number might be > negative so populate upper bits with unknown. Otherwise, number is > known positive so populate upper bits with known 0s. Right, my last version just used this tnum_arshift :-). >> + else >> + dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val); > Applying the above here, tnum_arshift(tnum_unknown, ...) would always just > return tnum_unknown, so just do "dst_reg->var_off = tnum_unknown;". > The reason for the corresponding logic in the BPF_RSH case is that a right > logical shift _always_ populates upper bits with zeroes. > In any case these 'else' branches are currently never taken because they > fall foul of the check Alexei added just before the switch, > if (!src_known && > opcode != BPF_ADD && opcode != BPF_SUB && opcode != BPF_AND) { > __mark_reg_unknown(dst_reg); > return 0; > } > So I can guarantee you haven't tested this code :-) I just noticed this last night and removed the !src_known branch all together here and from LSH/RSH. > >> + dst_reg->umin_value >>= umax_val; >> + dst_reg->umax_value >>= umin_val; > FWIW I think the way to handle umin/umax here is to blow them away and > just rely on inferring new ubounds from the sbounds (i.e. the inverse of > what we do just above in case BPF_RSH) since BPF_ARSH is essentially an > operation on the signed value. I don't think there is a need to support > cases where the unsigned bounds of a signed shift of a value that may > cross the sign boundary at (1<<63) are needed to verify a program. > (Unlike in the unsigned shift case, it is at least _possible_ for there to > be information from the ubounds that we can't get from the sbounds - but > it's a contrived case that isn't likely to be useful in real programs.) This makes sense and will make code simpler and easy to understand. Will make the change. Thanks! > > -Ed >> + /* We may learn something more from the var_off */ >> + __update_reg_bounds(dst_reg); >> + break; >> default: >> mark_reg_unknown(env, regs, insn->dst_reg); >> break;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 3c8bb92..01c215d 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, /* We may learn something more from the var_off */ __update_reg_bounds(dst_reg); break; + case BPF_ARSH: + if (umax_val >= insn_bitness) { + /* Shifts greater than 31 or 63 are undefined. + * This includes shifts by a negative number. + */ + mark_reg_unknown(env, regs, insn->dst_reg); + break; + } + if (dst_reg->smin_value < 0) + dst_reg->smin_value >>= umin_val; + else + dst_reg->smin_value >>= umax_val; + if (dst_reg->smax_value < 0) + dst_reg->smax_value >>= umax_val; + else + dst_reg->smax_value >>= umin_val; + if (src_known) + dst_reg->var_off = tnum_rshift(dst_reg->var_off, + umin_val); + else + dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val); + dst_reg->umin_value >>= umax_val; + dst_reg->umax_value >>= umin_val; + /* We may learn something more from the var_off */ + __update_reg_bounds(dst_reg); + break; default: mark_reg_unknown(env, regs, insn->dst_reg); break;
When helpers like bpf_get_stack returns an int value and later on used for arithmetic computation, the LSH and ARSH operations are often required to get proper sign extension into 64-bit. For example, without this patch: 54: R0=inv(id=0,umax_value=800) 54: (bf) r8 = r0 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) 55: (67) r8 <<= 32 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) 56: (c7) r8 s>>= 32 57: R8=inv(id=0) With this patch: 54: R0=inv(id=0,umax_value=800) 54: (bf) r8 = r0 55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800) 55: (67) r8 <<= 32 56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000)) 56: (c7) r8 s>>= 32 57: R8=inv(id=0, umax_value=800,var_off=(0x0; 0x3ff)) With better range of "R8", later on when "R8" is added to other register, e.g., a map pointer or scalar-value register, the better register range can be derived and verifier failure may be avoided. In our later example, ...... usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK); if (usize < 0) return 0; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); ...... Without improving ARSH value range tracking, the register representing "max_len - usize" will have smin_value equal to S64_MIN and will be rejected by verifier. Signed-off-by: Yonghong Song <yhs@fb.com> --- kernel/bpf/verifier.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+)