
[bpf-next,v6,02/11] bpf: Fix UAF due to race between btf_try_get_module and load_module

Message ID: 20220102162115.1506833-3-memxor@gmail.com
State: Awaiting Upstream
Delegated to: Pablo Neira
Series: Introduce unstable CT lookup helpers

Commit Message

Kumar Kartikeya Dwivedi Jan. 2, 2022, 4:21 p.m. UTC
While working on code to populate kfunc BTF ID sets for module BTF from
its initcall, I noticed that by the time the initcall is invoked, the
module BTF can already be seen by userspace (and the BPF verifier). The
existing btf_try_get_module calls try_module_get, which only fails if
mod->state == MODULE_STATE_GOING, i.e. it can increment the module's
reference count while the module initcall is still running in parallel.
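
For reference, this is roughly the distinction involved (a condensed
sketch, not the kernel's exact code; the real try_module_get also
disables preemption around the check, the sketch_* names are ours, and
try_module_get_live is the helper added in patch 1 of this series):

#include <linux/module.h>

/* Sketch: succeeds for both MODULE_STATE_COMING and MODULE_STATE_LIVE,
 * i.e. also while the module's __init is still running.
 */
static bool sketch_try_module_get(struct module *mod)
{
	return mod->state != MODULE_STATE_GOING &&
	       atomic_inc_not_zero(&mod->refcnt);
}

/* Sketch: stricter variant that only succeeds once the module's __init
 * has finished and the module has reached MODULE_STATE_LIVE.
 */
static bool sketch_try_module_get_live(struct module *mod)
{
	return mod->state == MODULE_STATE_LIVE &&
	       atomic_inc_not_zero(&mod->refcnt);
}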

Currently, BTF parsing happens from MODULE_STATE_COMING notifier
callback. At this point, the module initcalls have not been invoked.
The notifier callback parses and prepares the module BTF, allocates an
ID (which publishes it to userspace), and then adds it to the
btf_modules list, allowing the kernel to invoke btf_try_get_module for
the BTF.
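
Condensed, that COMING-time flow looks like this (a sketch based on the
MODULE_STATE_COMING handling in btf_module_notify() in kernel/bpf/btf.c,
with declarations, error handling and sysfs setup elided):

case MODULE_STATE_COMING:
	btf_mod = kzalloc(sizeof(*btf_mod), GFP_KERNEL);
	btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size);
	err = btf_alloc_id(btf);	/* BTF ID visible to userspace from here */
	mutex_lock(&btf_module_mutex);
	btf_mod->module = mod;
	btf_mod->btf = btf;
	/* From here on btf_try_get_module() can find this entry, even
	 * though mod->init() has not run yet.
	 */
	list_add(&btf_mod->list, &btf_modules);
	mutex_unlock(&btf_module_mutex);
	break;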

However, at this point, the module has not been fully initialized (i.e.
its initcalls have not finished). The code in module.c can still fail
and free the module without regard for other users. However, nothing
stops btf_try_get_module from succeeding in the window between the
MODULE_STATE_COMING and MODULE_STATE_LIVE transitions.

This leads to a use-after-free when a BPF program loads successfully
inside this window, load_module's do_init_module call then fails and
frees the module, and closing the BPF program fd calls module_put for
the already freed module. A later patch adds a test case to verify we
don't regress in this area.

There are multiple points after prepare_coming_module (in load_module)
where failure can occur and module loading can return an error. We
illustrate and test for the race using the last point where it can
practically occur (in the module's __init function).

An illustration of the race:

CPU 0                           CPU 1
			  load_module
			    notifier_call(MODULE_STATE_COMING)
			      btf_parse_module
			      btf_alloc_id	// Published to userspace
			      list_add(&btf_mod->list, btf_modules)
			    mod->init(...)
...				^
bpf_check		        |
check_pseudo_btf_id             |
  btf_try_get_module            |
    returns true                |  ...
...                             |  module __init in progress
return prog_fd                  |  ...
...                             V
			    if (ret < 0)
			      free_module(mod)
			    ...
close(prog_fd)
 ...
 bpf_prog_free_deferred
  module_put(used_btf.mod) // UAF
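
The failing leg on CPU 1 corresponds to the error path of
do_init_module() (heavily condensed from kernel/module.c of this era;
exact details vary across kernel versions):

	ret = do_one_initcall(mod->init);
	if (ret < 0) {
		/* init failed: the module is torn down and freed even if
		 * someone grabbed a reference via try_module_get() while
		 * it was still MODULE_STATE_COMING.
		 */
		mod->state = MODULE_STATE_GOING;
		module_put(mod);
		blocking_notifier_call_chain(&module_notify_list,
					     MODULE_STATE_GOING, mod);
		free_module(mod);
		return ret;
	}
	mod->state = MODULE_STATE_LIVE;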

A later selftest patch crafts the race condition artificially to verify
that it has been fixed, and that the verifier fails to load the program.

Lastly, a couple of comments:

 1. Even if this race didn't exist, it seems more appropriate to only
    access resources (ksyms and kfuncs) of a fully formed module which
    has been initialized completely.

 2. This patch was born out of need for synchronization against module
    initcall for the next patch, so it is needed for correctness even
    without the aforementioned race condition. The BTF resources
    initialized by module initcall are set up once and then only looked
    up, so just waiting until the initcall has finished ensures correct
    behavior.

Fixes: 541c3bad8dc5 ("bpf: Support BPF ksym variables in kernel modules")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/btf.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Alexei Starovoitov Jan. 5, 2022, 6:10 a.m. UTC | #1
On Sun, Jan 02, 2022 at 09:51:06PM +0530, Kumar Kartikeya Dwivedi wrote:
> 
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 33bb8ae4a804..b5b423de53ab 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -6338,7 +6338,10 @@ struct module *btf_try_get_module(const struct btf *btf)
>  		if (btf_mod->btf != btf)
>  			continue;
>  
> -		if (try_module_get(btf_mod->module))
> +		/* We must only consider module whose __init routine has
> +		 * finished, hence use try_module_get_live.
> +		 */
> +		if (try_module_get_live(btf_mod->module))

Instead of patch 1 refactoring for this very specific case can we do:
1.
if (try_module_get(btf_mod->module)) {
     if (btf_mod->module->state != MODULE_STATE_LIVE)
        module_put(btf_mod->module);
     else
        res = btf_mod->module;

2. 
preempt_disable();
if (btf_mod->module->state == MODULE_STATE_LIVE &&
    try_module_get(btf_mod->module)) ...
preempt_enable();

3. add
case MODULE_STATE_LIVE:
to btf_module_notify()
and have an extra flag in struct btf_module to say that it's ready?

I'm mainly concerned about:
-EXPORT_SYMBOL(try_module_get);
+EXPORT_SYMBOL(__try_module_get);
in the patch 1. Not that I care about out of tree modules,
but we shouldn't be breaking them without a reason.
Kumar Kartikeya Dwivedi Jan. 6, 2022, 8:46 a.m. UTC | #2
On Wed, Jan 05, 2022 at 11:40:40AM IST, Alexei Starovoitov wrote:
> On Sun, Jan 02, 2022 at 09:51:06PM +0530, Kumar Kartikeya Dwivedi wrote:
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 33bb8ae4a804..b5b423de53ab 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -6338,7 +6338,10 @@ struct module *btf_try_get_module(const struct btf *btf)
> >  		if (btf_mod->btf != btf)
> >  			continue;
> >
> > -		if (try_module_get(btf_mod->module))
> > +		/* We must only consider module whose __init routine has
> > +		 * finished, hence use try_module_get_live.
> > +		 */
> > +		if (try_module_get_live(btf_mod->module))
>
> Instead of patch 1 refactoring for this very specific case can we do:
> 1.
> if (try_module_get(btf_mod->module)) {
>      if (btf_mod->module->state != MODULE_STATE_LIVE)
>         module_put(btf_mod->module);
>      else
>         res = btf_mod->module;
>
> 2.
> preempt_disable();
> if (btf_mod->module->state == MODULE_STATE_LIVE &&
>     try_module_get(btf_mod->module)) ...
> preempt_enable();
>
> 3. add
> case MODULE_STATE_LIVE:
> to btf_module_notify()
> and have an extra flag in struct btf_module to say that it's ready?
>
> I'm mainly concerned about:
> -EXPORT_SYMBOL(try_module_get);
> +EXPORT_SYMBOL(__try_module_get);
> in the patch 1. Not that I care about out of tree modules,
> but we shouldn't be breaking them without a reason.

Alright, we could also export try_module_get, but let's go with option 3.

--
Kartikeya
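
For reference, one way option 3 could be wired up (a sketch only; the
BTF_MODULE_F_LIVE flag name is illustrative and the struct is abridged):

struct btf_module {
	struct list_head list;
	struct module *module;
	struct btf *btf;
	int flags;	/* new: records whether __init has finished */
};

#define BTF_MODULE_F_LIVE	(1 << 0)

/* In btf_module_notify(), mark the entry once the module goes live: */
case MODULE_STATE_LIVE:
	mutex_lock(&btf_module_mutex);
	list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) {
		if (btf_mod->module != module)
			continue;
		btf_mod->flags |= BTF_MODULE_F_LIVE;
		break;
	}
	mutex_unlock(&btf_module_mutex);
	break;

/* And in btf_try_get_module(), only hand out a reference for entries
 * whose module was seen to reach MODULE_STATE_LIVE:
 */
if ((btf_mod->flags & BTF_MODULE_F_LIVE) && try_module_get(btf_mod->module))
	res = btf_mod->module;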

Patch

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 33bb8ae4a804..b5b423de53ab 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6338,7 +6338,10 @@ struct module *btf_try_get_module(const struct btf *btf)
 		if (btf_mod->btf != btf)
 			continue;
 
-		if (try_module_get(btf_mod->module))
+		/* We must only consider module whose __init routine has
+		 * finished, hence use try_module_get_live.
+		 */
+		if (try_module_get_live(btf_mod->module))
 			res = btf_mod->module;
 
 		break;