[bpf] bpf: verifier, do_refine_retval_range may clamp umin to 0 incorrectly
diff mbox series

Message ID 157984984270.18622.13529102486040865869.stgit@john-XPS-13-9370
State Changes Requested
Delegated to: BPF Maintainers
Headers show
Series
  • [bpf] bpf: verifier, do_refine_retval_range may clamp umin to 0 incorrectly
Related show

Commit Message

John Fastabend Jan. 24, 2020, 7:10 a.m. UTC
do_refine_retval_range() is called to refine return values from specified
helpers, probe_read_str and get_stack at the moment, the reasoning is
because both have a max value as part of their input arguments and
because the helper ensure the return value will not be larger than this
we can set umax and smax values of the return register, r0.

However, the return value is a signed integer so setting umax is incorrect
It leads to further confusion when the do_refine_retval_range() then calls,
__reg_deduce_bounds() which will see a umax value as meaning the value is
unsigned and then assuming it is unsigned set the smin = umin which in this
case results in 'smin = 0' and an 'smax = X' where X is the input argument
from the helper call.

Here are the comments from _reg_deduce_bounds() on why this would be safe
to do.

 /* Learn sign from unsigned bounds.  Signed bounds cross the sign
  * boundary, so we must be careful.
  */
 if ((s64)reg->umax_value >= 0) {
	/* Positive.  We can't learn anything from the smin, but smax
	 * is positive, hence safe.
	 */
	reg->smin_value = reg->umin_value;
	reg->smax_value = reg->umax_value = min_t(u64, reg->smax_value,
						  reg->umax_value);

But now we incorrectly have a return value with type int with the
signed bounds (0,X). Suppose the return value is negative, which is
possible the we have the verifier and reality out of sync. Among other
things this may result in any error handling code being falsely detected
as dead-code and removed. For instance the example below shows using
bpf_probe_read_str() causes the error path to be identified as dead
code and removed.

>From the 'llvm-object -S' dump,

 r2 = 100
 call 45
 if r0 s< 0 goto +4
 r4 = *(u32 *)(r7 + 0)

But from dump xlate

  (b7) r2 = 100
  (85) call bpf_probe_read_compat_str#-96768
  (61) r4 = *(u32 *)(r7 +0)  <-- dropped if goto

Due to verifier state after call being

 R0=inv(id=0,umax_value=100,var_off=(0x0; 0x7f))

To fix omit setting the umax value because its not safe. The only
actual bounds we know is the smax. This results in the correct bounds
(SMIN, X) where X is the max length from the helper. After this the
new verifier state looks like the following after call 45.

R0=inv(id=0,smax_value=100)

Then xlated version no longer removed dead code giving the expected
result,

  (b7) r2 = 100
  (85) call bpf_probe_read_compat_str#-96768
  (c5) if r0 s< 0x0 goto pc+4
  (61) r4 = *(u32 *)(r7 +0)

Note, bpf_probe_read_* calls are root only so we wont hit this case
with non-root bpf users.

Fixes: 849fa50662fbc ("bpf: verifier, refine bounds may clamp umin to 0 incorrectly")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/verifier.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Daniel Borkmann Jan. 24, 2020, 8:36 a.m. UTC | #1
On 1/24/20 8:10 AM, John Fastabend wrote:
> do_refine_retval_range() is called to refine return values from specified
> helpers, probe_read_str and get_stack at the moment, the reasoning is
> because both have a max value as part of their input arguments and
> because the helper ensure the return value will not be larger than this
> we can set umax and smax values of the return register, r0.
> 
> However, the return value is a signed integer so setting umax is incorrect
> It leads to further confusion when the do_refine_retval_range() then calls,
> __reg_deduce_bounds() which will see a umax value as meaning the value is
> unsigned and then assuming it is unsigned set the smin = umin which in this
> case results in 'smin = 0' and an 'smax = X' where X is the input argument
> from the helper call.
> 
> Here are the comments from _reg_deduce_bounds() on why this would be safe
> to do.
> 
>   /* Learn sign from unsigned bounds.  Signed bounds cross the sign
>    * boundary, so we must be careful.
>    */
>   if ((s64)reg->umax_value >= 0) {
> 	/* Positive.  We can't learn anything from the smin, but smax
> 	 * is positive, hence safe.
> 	 */
> 	reg->smin_value = reg->umin_value;
> 	reg->smax_value = reg->umax_value = min_t(u64, reg->smax_value,
> 						  reg->umax_value);
> 
> But now we incorrectly have a return value with type int with the
> signed bounds (0,X). Suppose the return value is negative, which is
> possible the we have the verifier and reality out of sync. Among other
> things this may result in any error handling code being falsely detected
> as dead-code and removed. For instance the example below shows using
> bpf_probe_read_str() causes the error path to be identified as dead
> code and removed.
> 
>>From the 'llvm-object -S' dump,
> 
>   r2 = 100
>   call 45
>   if r0 s< 0 goto +4
>   r4 = *(u32 *)(r7 + 0)
> 
> But from dump xlate
> 
>    (b7) r2 = 100
>    (85) call bpf_probe_read_compat_str#-96768
>    (61) r4 = *(u32 *)(r7 +0)  <-- dropped if goto
> 
> Due to verifier state after call being
> 
>   R0=inv(id=0,umax_value=100,var_off=(0x0; 0x7f))
> 
> To fix omit setting the umax value because its not safe. The only
> actual bounds we know is the smax. This results in the correct bounds
> (SMIN, X) where X is the max length from the helper. After this the
> new verifier state looks like the following after call 45.
> 
> R0=inv(id=0,smax_value=100)
> 
> Then xlated version no longer removed dead code giving the expected
> result,
> 
>    (b7) r2 = 100
>    (85) call bpf_probe_read_compat_str#-96768
>    (c5) if r0 s< 0x0 goto pc+4
>    (61) r4 = *(u32 *)(r7 +0)
> 
> Note, bpf_probe_read_* calls are root only so we wont hit this case
> with non-root bpf users.
> 
> Fixes: 849fa50662fbc ("bpf: verifier, refine bounds may clamp umin to 0 incorrectly")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

Been reviewing this fix internally, therefore also:

Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>

Still waiting to give Yonghong a chance to take a look as well before applying
and getting bpf PR out (should be roughly in morning US time).

Thanks,
Daniel
Yonghong Song Jan. 24, 2020, 4:39 p.m. UTC | #2
On 1/24/20 12:36 AM, Daniel Borkmann wrote:
> On 1/24/20 8:10 AM, John Fastabend wrote:
>> do_refine_retval_range() is called to refine return values from specified
>> helpers, probe_read_str and get_stack at the moment, the reasoning is
>> because both have a max value as part of their input arguments and
>> because the helper ensure the return value will not be larger than this
>> we can set umax and smax values of the return register, r0.
>>
>> However, the return value is a signed integer so setting umax is 
>> incorrect
>> It leads to further confusion when the do_refine_retval_range() then 
>> calls,
>> __reg_deduce_bounds() which will see a umax value as meaning the value is
>> unsigned and then assuming it is unsigned set the smin = umin which in 
>> this
>> case results in 'smin = 0' and an 'smax = X' where X is the input 
>> argument
>> from the helper call.
>>
>> Here are the comments from _reg_deduce_bounds() on why this would be safe
>> to do.
>>
>>   /* Learn sign from unsigned bounds.  Signed bounds cross the sign
>>    * boundary, so we must be careful.
>>    */
>>   if ((s64)reg->umax_value >= 0) {
>>     /* Positive.  We can't learn anything from the smin, but smax
>>      * is positive, hence safe.
>>      */
>>     reg->smin_value = reg->umin_value;
>>     reg->smax_value = reg->umax_value = min_t(u64, reg->smax_value,
>>                           reg->umax_value);
>>
>> But now we incorrectly have a return value with type int with the
>> signed bounds (0,X). Suppose the return value is negative, which is
>> possible the we have the verifier and reality out of sync. Among other
>> things this may result in any error handling code being falsely detected
>> as dead-code and removed. For instance the example below shows using
>> bpf_probe_read_str() causes the error path to be identified as dead
>> code and removed.
>>
>>> From the 'llvm-object -S' dump,
>>
>>   r2 = 100
>>   call 45
>>   if r0 s< 0 goto +4
>>   r4 = *(u32 *)(r7 + 0)
>>
>> But from dump xlate
>>
>>    (b7) r2 = 100
>>    (85) call bpf_probe_read_compat_str#-96768
>>    (61) r4 = *(u32 *)(r7 +0)  <-- dropped if goto
>>
>> Due to verifier state after call being
>>
>>   R0=inv(id=0,umax_value=100,var_off=(0x0; 0x7f))
>>
>> To fix omit setting the umax value because its not safe. The only
>> actual bounds we know is the smax. This results in the correct bounds
>> (SMIN, X) where X is the max length from the helper. After this the
>> new verifier state looks like the following after call 45.
>>
>> R0=inv(id=0,smax_value=100)
>>
>> Then xlated version no longer removed dead code giving the expected
>> result,
>>
>>    (b7) r2 = 100
>>    (85) call bpf_probe_read_compat_str#-96768
>>    (c5) if r0 s< 0x0 goto pc+4
>>    (61) r4 = *(u32 *)(r7 +0)
>>
>> Note, bpf_probe_read_* calls are root only so we wont hit this case
>> with non-root bpf users.
>>
>> Fixes: 849fa50662fbc ("bpf: verifier, refine bounds may clamp umin to 
>> 0 incorrectly")
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> 
> Been reviewing this fix internally, therefore also:
> 
> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
> 
> Still waiting to give Yonghong a chance to take a look as well before 
> applying
> and getting bpf PR out (should be roughly in morning US time).

The change makes sense to me. Thanks!
Acked-by: Yonghong Song <yhs@fb.com>
Alexei Starovoitov Jan. 25, 2020, 11:13 a.m. UTC | #3
On Thu, Jan 23, 2020 at 11:10:42PM -0800, John Fastabend wrote:
> @@ -3573,7 +3572,6 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
>  		 * to refine return values.
>  		 */
>  		meta->msize_smax_value = reg->smax_value;
> -		meta->msize_umax_value = reg->umax_value;
>  
>  		/* The register is SCALAR_VALUE; the access check
>  		 * happens using its boundaries.
> @@ -4078,9 +4076,9 @@ static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type,
>  		return;
>  
>  	ret_reg->smax_value = meta->msize_smax_value;
> -	ret_reg->umax_value = meta->msize_umax_value;
>  	__reg_deduce_bounds(ret_reg);
>  	__reg_bound_offset(ret_reg);
> +	__update_reg_bounds(ret_reg);

Thanks a lot for the analysis and the fix.
I think there is still a small problem.
The variable is called msize_smax_value which is used to remember smax,
but the helpers actually use umax and the rest of
if (arg_type_is_mem_size(arg_type)) { ..}
branch is validating [0,umax] range of memory.
bpf_get_stack() and probe_read_str*() have 'u32 size' arguments too.
So doing
meta->msize_smax_value = reg->smax_value;
isn't quite correct.
Also the name is misleading, since the verifier needs to remember
the size 'signed max' doesn't have the right meaning here.
It's just a size. It cannot be 'signed' and cannot be negative.
How about renaming it to just msize_max_value and do
meta->msize_max_value = reg->umax_value;
while later do:
ret_reg->smax_value = meta->msize_max_value;
with a comment that return value from the helpers
is 'int' and not 'unsigned int' while input argument is 'u32 size'.
John Fastabend Jan. 26, 2020, 3:35 a.m. UTC | #4
Alexei Starovoitov wrote:
> On Thu, Jan 23, 2020 at 11:10:42PM -0800, John Fastabend wrote:
> > @@ -3573,7 +3572,6 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
> >  		 * to refine return values.
> >  		 */
> >  		meta->msize_smax_value = reg->smax_value;
> > -		meta->msize_umax_value = reg->umax_value;
> >  
> >  		/* The register is SCALAR_VALUE; the access check
> >  		 * happens using its boundaries.
> > @@ -4078,9 +4076,9 @@ static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type,
> >  		return;
> >  
> >  	ret_reg->smax_value = meta->msize_smax_value;
> > -	ret_reg->umax_value = meta->msize_umax_value;
> >  	__reg_deduce_bounds(ret_reg);
> >  	__reg_bound_offset(ret_reg);
> > +	__update_reg_bounds(ret_reg);
> 
> Thanks a lot for the analysis and the fix.
> I think there is still a small problem.
> The variable is called msize_smax_value which is used to remember smax,
> but the helpers actually use umax and the rest of
> if (arg_type_is_mem_size(arg_type)) { ..}
> branch is validating [0,umax] range of memory.
> bpf_get_stack() and probe_read_str*() have 'u32 size' arguments too.
> So doing
> meta->msize_smax_value = reg->smax_value;
> isn't quite correct.

Agree.

> Also the name is misleading, since the verifier needs to remember
> the size 'signed max' doesn't have the right meaning here.
> It's just a size. It cannot be 'signed' and cannot be negative.
> How about renaming it to just msize_max_value and do
> meta->msize_max_value = reg->umax_value;

msize_max_value reads better and we can give it u64 type so the
"cannot be negative" part is clear.

> while later do:
> ret_reg->smax_value = meta->msize_max_value;
> with a comment that return value from the helpers
> is 'int' and not 'unsigned int' while input argument is 'u32 size'.

Sounds better I'll draft a v2.
John Fastabend Jan. 26, 2020, 10:25 p.m. UTC | #5
John Fastabend wrote:
> Alexei Starovoitov wrote:
> > On Thu, Jan 23, 2020 at 11:10:42PM -0800, John Fastabend wrote:
> > > @@ -3573,7 +3572,6 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
> > >  		 * to refine return values.
> > >  		 */
> > >  		meta->msize_smax_value = reg->smax_value;
> > > -		meta->msize_umax_value = reg->umax_value;
> > >  
> > >  		/* The register is SCALAR_VALUE; the access check
> > >  		 * happens using its boundaries.
> > > @@ -4078,9 +4076,9 @@ static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type,
> > >  		return;
> > >  
> > >  	ret_reg->smax_value = meta->msize_smax_value;
> > > -	ret_reg->umax_value = meta->msize_umax_value;
> > >  	__reg_deduce_bounds(ret_reg);
> > >  	__reg_bound_offset(ret_reg);
> > > +	__update_reg_bounds(ret_reg);
> > 
> > Thanks a lot for the analysis and the fix.
> > I think there is still a small problem.
> > The variable is called msize_smax_value which is used to remember smax,
> > but the helpers actually use umax and the rest of
> > if (arg_type_is_mem_size(arg_type)) { ..}
> > branch is validating [0,umax] range of memory.
> > bpf_get_stack() and probe_read_str*() have 'u32 size' arguments too.
> > So doing
> > meta->msize_smax_value = reg->smax_value;
> > isn't quite correct.
> 
> Agree.
> 
> > Also the name is misleading, since the verifier needs to remember
> > the size 'signed max' doesn't have the right meaning here.
> > It's just a size. It cannot be 'signed' and cannot be negative.
> > How about renaming it to just msize_max_value and do
> > meta->msize_max_value = reg->umax_value;
> 
> msize_max_value reads better and we can give it u64 type so the
> "cannot be negative" part is clear.
> 
> > while later do:
> > ret_reg->smax_value = meta->msize_max_value;
> > with a comment that return value from the helpers
> > is 'int' and not 'unsigned int' while input argument is 'u32 size'.
> 
> Sounds better I'll draft a v2.

v2 on the list now. Thanks!

@Daniel, I dropped your reviewed-by please take another look the change
should be small and functionality was the same but still its not the
same thing you reviewed ;)

@Alexei, Can you check you agree with the "v2 note:" I added at the
bottom of the commit. The gist is it doesn't really matter from
the verifiers point of view if we use umax_value or smax_value for
the msize_max_value because just below setting msize_max_value we
check tnum_is_const and value is positive. That said it reads better
now.

Patch
diff mbox series

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 7d530ce8719d..9f310db68073 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -228,7 +228,6 @@  struct bpf_call_arg_meta {
 	int regno;
 	int access_size;
 	s64 msize_smax_value;
-	u64 msize_umax_value;
 	int ref_obj_id;
 	int func_id;
 	u32 btf_id;
@@ -3573,7 +3572,6 @@  static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 		 * to refine return values.
 		 */
 		meta->msize_smax_value = reg->smax_value;
-		meta->msize_umax_value = reg->umax_value;
 
 		/* The register is SCALAR_VALUE; the access check
 		 * happens using its boundaries.
@@ -4078,9 +4076,9 @@  static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type,
 		return;
 
 	ret_reg->smax_value = meta->msize_smax_value;
-	ret_reg->umax_value = meta->msize_umax_value;
 	__reg_deduce_bounds(ret_reg);
 	__reg_bound_offset(ret_reg);
+	__update_reg_bounds(ret_reg);
 }
 
 static int